Stéphane Chazelas

Also beware that getline file, contrary to read -r line, doesn't strip leading and trailing spaces and tabs from the input line. If you want them stripped, you have to do it manually:

getline file
sub(/^[ \t]*/, "", file)
sub(/[ \t]*$/, "", file)

For instance.
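A minimal sketch of that difference, assuming a made-up input line with blanks on both sides (the variable names are only for illustration, and getline reads from standard input here rather than from awk.data):

```shell
# Hypothetical sample line with leading and trailing spaces.
line_in='  some file.txt  '

# sh: with the default $IFS, read -r strips leading and trailing blanks.
sh_result=$(printf '%s\n' "$line_in" | { read -r line; printf '[%s]' "$line"; })

# awk: getline keeps the line as-is.
awk_raw=$(printf '%s\n' "$line_in" |
  awk 'BEGIN { if ((getline file) > 0) printf "[%s]", file }')

# awk: getline followed by the two sub() calls strips the blanks.
awk_stripped=$(printf '%s\n' "$line_in" |
  awk 'BEGIN {
    if ((getline file) > 0) {
      sub(/^[ \t]*/, "", file)
      sub(/[ \t]*$/, "", file)
      printf "[%s]", file
    }
  }')

printf '%s\n' "read -r:       $sh_result" \
              "getline:       $awk_raw" \
              "getline + sub: $awk_stripped"
```

The brackets are only there to make the blanks visible in the output.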

Another difference from your while read loop is that if the last line is not delimited (has no trailing newline), awk still processes it, whereas a while read loop in sh discards it.
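That can be checked with a two-line input whose second line lacks the trailing newline. This is only a sketch, and the variable names are made up:

```shell
# sh: read returns a non-zero status on the undelimited last line,
# so the loop body runs only once.
sh_count=$(printf 'a\nb' | {
  n=0
  while IFS= read -r line; do n=$((n + 1)); done
  printf '%s\n' "$n"
})

# awk: the undelimited last line is still counted as a record.
awk_count=$(printf 'a\nb' | awk 'END { print NR }')

printf 'sh loop iterations: %s, awk records: %s\n' "$sh_count" "$awk_count"
```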

You could do something like:

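#! /usr/bin/awk -f
BEGIN {
  # Replace the argument list with the file paths listed in awk.data
  ARGC = 1
  while ((getline file < "awk.data") > 0)
    ARGV[ARGC++] = file
  # Today's date in YYYY-MM-DD format
  "date +%Y-%m-%d" | getline date
}
# New file: nothing scheduled for printing yet
FNR == 1 {
  line_to_print = 0
}
# A date was found earlier in this file: print the target line, then move on
line_to_print {
  if (FNR == line_to_print) {print; nextfile}
  next
}
# First line containing the date: schedule the 10th line after it
index($0, date) {line_to_print = FNR + 10}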

nextfile is not POSIX yet, but will be in the next version. The code above still works in awk implementations that don't support nextfile: there, nextfile is parsed as a reference to an unset variable, which is still valid code but does nothing.
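As a quick check of that script, here is a sketch that runs the same awk code from sh against made-up data. The temporary directory, sample.log and its contents are assumptions for the demonstration only. The date is put on line 2, so line 12 is the one expected to be printed:

```shell
tmpdir=$(mktemp -d) || exit 1
today=$(date +%Y-%m-%d)

# Hypothetical input file: today's date on line 2, 15 lines in total.
{
  printf '%s\n' 'header' "run started $today"
  i=3
  while [ "$i" -le 15 ]; do
    printf 'line %s\n' "$i"
    i=$((i + 1))
  done
} > "$tmpdir/sample.log"

# awk.data lists the files to process, one per line.
printf '%s\n' sample.log > "$tmpdir/awk.data"

output=$(cd "$tmpdir" && awk '
  BEGIN {
    ARGC = 1
    while ((getline file < "awk.data") > 0)
      ARGV[ARGC++] = file
    "date +%Y-%m-%d" | getline date
  }
  FNR == 1 { line_to_print = 0 }
  line_to_print {
    if (FNR == line_to_print) { print; nextfile }
    next
  }
  index($0, date) { line_to_print = FNR + 10 }
' < /dev/null)

printf '%s\n' "$output"
rm -rf "$tmpdir"
```

With or without nextfile support, nothing after line 12 of that file is printed, since every later line hits the next statement.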

Note that POSIX specifies neither the shebang mechanism nor the path of the awk utility. #! /path/to/awk -f shebangs are not reliable either: upon invocation, that-script -x becomes /path/to/awk -f /path/to/that-script -x, where the -x could be treated as an option by awk (and an argument like '-eBEGIN{system("reboot")}' would reboot the system with the GNU implementation of awk, for instance).

In "date..." | getline date, awk does invoke sh to interpret the command line, so that does not remove sh from the equation; awk cannot run commands without the help of sh. GNU awk can format the current date, but that's not standard. POSIXly, you can get the current date as epoch time with srand() (though OpenBSD is not POSIX-compliant in that regard), but converting that to YYYY-MM-DD format in the user's timezone would be quite difficult. perl would likely be a much better language than awk here if the point is to avoid sh.
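For what it's worth, the srand() approach could look like the sketch below. It relies on srand() returning the previous seed, and on an argument-less srand() seeding from the time of day, so the result is only expected to be the epoch time on implementations that follow POSIX there:

```shell
# The first srand() seeds from the time of day; the second call returns
# that previous seed, which is the number of seconds since the epoch
# on most awk implementations.
epoch=$(awk 'BEGIN { srand(); print srand() }')

printf 'seconds since the epoch according to awk: %s\n' "$epoch"
```

Turning that number into a calendar date in the local timezone is the hard part, and is not attempted here.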

Beware that if lines of awk.data are in the foo=bar.html format, awk will treat them as variable assignments instead of paths of files to process. If that may be the case, you could sanitise those paths in the BEGIN block with:

function sanitise(path) {
  if (path != "" && path !~ /^\//)
    return "./" path
  else
    return path
}

(and use ARGV[ARGC++] = sanitise(file) instead of ARGV[ARGC++] = file).
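To see the effect, here is a sketch run from sh with a made-up file called foo=bar.html. The run helper, its do_sanitise variable and the temporary directory are hypothetical and only there for the demonstration:

```shell
tmpdir=$(mktemp -d) || exit 1
printf 'content\n' > "$tmpdir/foo=bar.html"
printf '%s\n' 'foo=bar.html' > "$tmpdir/awk.data"

# run 0: paths taken as-is; run 1: paths passed through sanitise().
run() {
  (cd "$tmpdir" && awk -v do_sanitise="$1" '
    function sanitise(path) {
      if (path != "" && path !~ /^\//)
        return "./" path
      else
        return path
    }
    BEGIN {
      ARGC = 1
      while ((getline file < "awk.data") > 0)
        ARGV[ARGC++] = (do_sanitise == 1) ? sanitise(file) : file
    }
    { print FILENAME ": " $0 }' < /dev/null)
}

# As-is, foo=bar.html is an assignment: no file operand is left, so awk
# reads its (empty) standard input and prints nothing.
unsanitised=$(run 0)

# With the ./ prefix, the same name is opened as a file.
sanitised=$(run 1)

printf 'unsanitised: [%s]\nsanitised:   [%s]\n' "$unsanitised" "$sanitised"
rm -rf "$tmpdir"
```

The ./ prefix works because an assignment operand has to start with a letter or underscore, which ./foo=bar.html does not.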