The line address 0 is a non-standard extension of the GNU implementation of sed. It allows things like 0,/pattern/ x, which runs the x action on lines from the start of the input(s) (or the start of each file with -s or -i, both also GNU extensions) to the first line that matches pattern, even if that line is the first line.
With the standard 1,/pattern/ x, x would run on lines 1 through the first line after line 1 that matches the pattern, which is not what you want when line 1 itself matches the pattern.
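A quick way to see the difference, assuming GNU sed and a made-up four-line input (foo, bar, foo, baz) whose first line matches the pattern, with d as the action:

```shell
# Requires GNU sed for the 0 address; the sample input is made up.
input=$(printf '%s\n' foo bar foo baz)

# 0,/foo/: the range can end on line 1, so only line 1 is deleted.
with_zero=$(printf '%s\n' "$input" | sed '0,/foo/d')

# 1,/foo/: the end pattern is only looked for from line 2 onwards,
# so lines 1 to 3 are deleted.
with_one=$(printf '%s\n' "$input" | sed '1,/foo/d')

printf 'with 0:\n%s\nwith 1:\n%s\n' "$with_zero" "$with_one"
```

With 0, the lines bar, foo and baz are left; with 1, only baz is left.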
GNU sed has a couple of other extensions involving the ~ character, where you can specify a step when selecting addresses: 5~3 selects every 3rd line starting with the 5th one, and 12,~4 selects line 12 to the first line after that that's a multiple of 4 (here 16).
For some reason, in the latter form, it doesn't allow the first address to be 0¹, which causes the error you're getting (same for 0,4), though for those, you'd just write 1,4.
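The step addresses described above can be tried on numbered input. This is only a sketch assuming GNU sed; seq is there to generate the lines 1 to 20, and the addresses 5~3, 6,~4 and 0,~4 are arbitrary examples:

```shell
# Requires GNU sed; seq is used here only to generate numbered lines.
# first~step: every 3rd line starting with line 5.
every_third=$(seq 20 | sed -n '5~3p' | paste -sd ' ' -)

# addr1,~N: from line 6 up to the next line number that is a multiple of 4.
up_to_multiple=$(seq 20 | sed -n '6,~4p' | paste -sd ' ' -)

# 0 as the first address of an addr1,~N range is rejected when the
# script is parsed, so sed exits with a non-zero status.
zero_status=0
sed -n '0,~4p' </dev/null >/dev/null 2>&1 || zero_status=$?

printf '%s\n%s\n' "$every_third" "$up_to_multiple"
```

The first command selects lines 5 8 11 14 17 20 and the second lines 6 7 8.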
Now, as @choroba said, the correct syntax would be:
date_pattern=$(
LC_ALL=C date -d '30 days ago' +'^%d/%b/%Y:'
)
sed "0,\~${date_pattern}~ d"
where you need to prefix the first address delimiter with \ if you want to use one other than / (also note the LC_ALL=C, which you need to guarantee the month name abbreviations will be in English; the %d instead of %e, as already noted by @choroba; the ^ to anchor the search at the start of the line; and the -d (marginally more portable than --date) option going before the non-option +^%d/%b/%Y: argument, so it still works with GNU date when in a POSIX environment, where options are no longer recognised after the first non-option argument).
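To check what the date command produces, here is a small sketch assuming GNU date. A fixed, arbitrary date (2024-03-05) and -u are used instead of '30 days ago' so the result is reproducible; it also shows why %e would not match a zero-padded day in the log:

```shell
# Requires GNU date for -d. The date below is an arbitrary example.
pattern=$(LC_ALL=C date -u -d '2024-03-05' +'^%d/%b/%Y:')

# %e pads single-digit days with a space instead of a zero.
padded_with_space=$(LC_ALL=C date -u -d '2024-03-05' +'%e')

printf '%s\n' "$pattern"              # ^05/Mar/2024:
printf '[%s]\n' "$padded_with_space"  # [ 5]
```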
But that sed command deletes from the start of the input(s) to the first line that matches the regular expression stored in $date_pattern, which means it would also delete the first line (the timestamp) of the first log entry from that day, which is not what you want.
Here it would be better to use the standard and portable (without GNU extension):
sed "\~${date_pattern}~,\$!d"
That is, delete lines except (!) those from the first line from that day up to the last ($, which we escape with \ as $ is special to the shell inside double quotes).
That still assumes there is at least one line in the input from that day, and it may still output lines from earlier days if the logs are not guaranteed to be in chronological order².
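The difference between the two commands can be seen on a small sample. This is only a sketch: the log lines are made up, a fixed pattern stands in for the output of date, and the 0,... variant needs GNU sed:

```shell
# Hypothetical sample log; the fixed pattern replaces the date call.
date_pattern='^12/Mar/2024:'
log='10/Mar/2024:23:59:01 old entry
11/Mar/2024:08:00:00 old entry
12/Mar/2024:00:00:05 first entry of the day
12/Mar/2024:09:30:00 later entry
13/Mar/2024:01:00:00 newer entry'

# Portable form: delete everything outside "first match to last line".
kept=$(printf '%s\n' "$log" | sed "\~${date_pattern}~,\$!d")

# GNU form: deletes up to and including the first matching line.
gnu_variant=$(printf '%s\n' "$log" | sed "0,\~${date_pattern}~ d")

printf 'portable:\n%s\n\nGNU 0,... d:\n%s\n' "$kept" "$gnu_variant"
```

The portable form keeps the last three lines, while the 0,... d form keeps only the last two and loses the first entry of 12 March.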
¹ and ADDR1,~0 seems to be the same as ADDR1 for some reason!?
² which is not uncommon. That typically happens when the timestamp is for the start of an event (like when an HTTP request is received in Apache server logs) but the log entry is only added when the event is over (the HTTP response has been sent), and several events can happen in parallel.