My log file looks like the following sample:

10.434.22.334 - unauthenticated 10/Aug/2020:23:45:45 +0800 "GET /eai/random.jsp HTTP/1.1" 200 74
10.434.22.334 - unauthenticated 10/Aug/2020:23:45:45 +0800 "GET /eai/random.jsp HTTP/1.1" 200 74
10.434.22.334 - unauthenticated 11/Aug/2020:23:34:45 +0800 "GET /eai/random.jsp HTTP/1.1" 200 74
10.434.22.334 - unauthenticated 12/Aug/2020:23:45:43 +0800 "GET /eai/random.jsp HTTP/1.1" 200 74
10.434.22.334 - unauthenticated 13/Aug/2020:23:43:45 +0800 "GET /eai/random.jsp HTTP/1.1" 200 74
10.434.22.334 - unauthenticated 14/Aug/2020:23:33:45 +0800 "GET /eai/random.jsp HTTP/1.1" 200 74

I want to search the above entries by specifying a date range, like below:

./Logsearch.sh 10/Aug/2020 13/Aug/2020

Expected result:

10.434.22.334 - unauthenticated 10/Aug/2020:23:45:45 +0800 "GET /eai/random.jsp HTTP/1.1" 200 74
10.434.22.334 - unauthenticated 10/Aug/2020:23:45:45 +0800 "GET /eai/random.jsp HTTP/1.1" 200 74
10.434.22.334 - unauthenticated 11/Aug/2020:23:34:45 +0800 "GET /eai/random.jsp HTTP/1.1" 200 74
10.434.22.334 - unauthenticated 12/Aug/2020:23:45:43 +0800 "GET /eai/random.jsp HTTP/1.1" 200 74
10.434.22.334 - unauthenticated 13/Aug/2020:23:43:45 +0800 "GET /eai/random.jsp HTTP/1.1" 200 74

How can I do this?


Any idea how to write a script for this? My OS is Solaris 11. Please provide a sample script.

  • Date calculations are hard. Your best bet is to translate dates like "10/Aug/2020" into an internal format (e.g. Unix time, i.e. seconds since the "epoch") that can be compared directly, and to use the string format only for input and output. I'd also recommend a better scripting language than bash: python or perl should be much easier to use for this and for many other tasks. Commented Aug 15, 2020 at 0:42
  • If you're on a system with journalctl, you can use --since= and --until=. Commented Aug 15, 2020 at 1:05
  • Hi, I have no idea about journalctl. Can you give me some idea of how to solve this in a bash script? Commented Aug 16, 2020 at 2:54

4 Answers

1

That looks like a standard HTTP access log, so why not use grep to match a pattern of the dates you want?

$ grep '1[0-3]/Aug/2020' access_log

10.434.22.334 - unauthenticated 10/Aug/2020:23:45:45 +0800 "GET /eai/random.jsp HTTP/1.1" 200 74
10.434.22.334 - unauthenticated 10/Aug/2020:23:45:45 +0800 "GET /eai/random.jsp HTTP/1.1" 200 74
10.434.22.334 - unauthenticated 11/Aug/2020:23:34:45 +0800 "GET /eai/random.jsp HTTP/1.1" 200 74
10.434.22.334 - unauthenticated 12/Aug/2020:23:45:43 +0800 "GET /eai/random.jsp HTTP/1.1" 200 74
10.434.22.334 - unauthenticated 13/Aug/2020:23:43:45 +0800 "GET /eai/random.jsp HTTP/1.1" 200 74

The grep pattern '1[0-3]/Aug/2020' uses the range expression [0-3]. This expression matches a single character which can take the values 0,1,2,3. Combine that with the rest of the expression, and you get 10/Aug/2020, 11/Aug/2020, 12/Aug/2020 and 13/Aug/2020 as the possible patterns. grep will print out the lines from the log that match these patterns.

  • Hello Hexiel, no. I want to pass a date range as arguments, like "10/Jul/2020" "15/Aug/2020". How can I grep the lines that fall within that range in the above log? Any help? Commented Aug 15, 2020 at 4:49
  • Okay, if your input format is fixed that way, then you need a program that can understand dates and calculate their differences. The GNU date program can do this to some extent, but there are limitations. You can also consider using a language better suited to the task, like python. Commented Aug 15, 2020 at 6:05
  • Hello, is there any other option to achieve this by looping with the grep command? Commented Aug 15, 2020 at 6:46
  • It's possible. The date command allows strings like '+1 day', so it's possible to use the start date and then keep incrementing until the end date is reached. You could then use grep for each date in the range. You may have to do some conversions so that the output from date will match the date format in the logs. Commented Aug 15, 2020 at 7:22
  • Sorry, I cannot write an entire script for you. This site simply does not work that way. You should prepare your own script based on the information that I and the others have provided. Once you have a prototype, you can come back and ask specific questions about any problems that you are stuck on. Commented Aug 16, 2020 at 2:58
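
For reference, the looping approach described in the comments above could look roughly like the sketch below. It assumes GNU date (for the -d option; on Solaris 11 this may live at /usr/gnu/bin/date if the GNU utilities are installed) and a grep that accepts "-f -" to read patterns from standard input. The function names days_between and logsearch_grep are made up for illustration.

```shell
#!/bin/sh
# days_between START END: list each day from START to END, inclusive,
# formatted like the log (DD/Mon/YYYY). Assumes GNU date for -d.
days_between() {
    # Turn the slashes into spaces ("10 Aug 2020"), a form GNU date parses.
    cur=$(LC_ALL=C date -u -d "$(printf '%s' "$1" | tr '/' ' ')" +%s) || return 1
    end=$(LC_ALL=C date -u -d "$(printf '%s' "$2" | tr '/' ' ')" +%s) || return 1
    while [ "$cur" -le "$end" ]; do
        LC_ALL=C date -u -d "@$cur" +%d/%b/%Y
        cur=$((cur + 86400))   # in UTC every day is exactly 86400 seconds
    done
}

# logsearch_grep START END FILE: run grep once over FILE, using one
# fixed-string pattern per day in the range.
logsearch_grep() {
    days_between "$1" "$2" | grep -F -f - "$3"
}
```

It would be called as logsearch_grep 10/Aug/2020 13/Aug/2020 access_log. Since the match is a plain substring match, a date that happens to appear elsewhere on a line (in a URL, say) would also match.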
0

You could use a specialized structured-text tool such as Miller (https://github.com/johnkerl/miller) and run

mlr --nidx then filter 'strftime(strptime($4,"%d/%b/%Y:%H:%M:%S"),"%Y-%m-%d") >="2020-08-11" && strftime(strptime($4,"%d/%b/%Y:%H:%M:%S"),"%Y-%m-%d") <="2020-08-13"' input.txt

to get

10.434.22.334 - unauthenticated 11/Aug/2020:23:34:45 +0800 "GET /eai/random.jsp HTTP/1.1" 200 74
10.434.22.334 - unauthenticated 12/Aug/2020:23:45:43 +0800 "GET /eai/random.jsp HTTP/1.1" 200 74
10.434.22.334 - unauthenticated 13/Aug/2020:23:43:45 +0800 "GET /eai/random.jsp HTTP/1.1" 200 74

The filter keeps every entry dated between 2020-08-11 and 2020-08-13, inclusive.

Some notes:

  • --nidx sets the input and output format (https://bit.ly/3h3UvN3);
  • filter applies the filter expression;
  • strftime(strptime($4,"%d/%b/%Y:%H:%M:%S"),"%Y-%m-%d") >="2020-08-11" is one of the two filter conditions. strptime parses the fourth field ($4) using the input date format (%d/%b/%Y:%H:%M:%S), and strftime rewrites it as %Y-%m-%d so that it can be compared with the bounds.
  • Hi, sorry, we are a big IT company and are not allowed to install any external tools on the production server. Please advise how to achieve the above with tools like awk, grep, or find. Thanks in advance. Commented Aug 16, 2020 at 9:46
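
Where only stock tools are allowed, the same comparison can be sketched in awk: convert each DD/Mon/YYYY date to a sortable number (YYYYMMDD) and compare numbers. This is a sketch under assumptions, not a tested Solaris script: it needs a POSIX awk that supports -v and user-defined functions (on Solaris that probably means nawk or /usr/xpg4/bin/awk rather than the default awk), it assumes the timestamp is always the fourth whitespace-separated field, and the function name logsearch is made up for illustration.

```shell
#!/bin/sh
# logsearch START END [FILE]: print log lines dated from START to END, inclusive.
# START and END use the same DD/Mon/YYYY format as the log itself.
logsearch() {
    awk -v start="$1" -v end="$2" '
        # Turn DD/Mon/YYYY into the sortable number YYYYMMDD.
        function key(d,    p) {
            split(d, p, "/")
            return p[3] * 10000 + mon[p[2]] * 100 + p[1]
        }
        BEGIN {
            n = split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", names, " ")
            for (i = 1; i <= n; i++) mon[names[i]] = i
            lo = key(start); hi = key(end)
        }
        {
            # Field 4 looks like 10/Aug/2020:23:45:45; keep only the date part.
            split($4, stamp, ":")
            k = key(stamp[1])
            if (k >= lo && k <= hi) print
        }
    ' "${3:--}"   # read the named file, or standard input if none is given
}
```

It would be called as logsearch 10/Aug/2020 13/Aug/2020 access_log. The comparison uses the date exactly as written in the log and ignores the +0800 offset, so the range is interpreted in the log's own local time.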
0

Using Raku (formerly known as Perl_6)

~$ raku -e 'my $start_date = DateTime.new("2020-08-11").in-timezone(28800);  \
            my $stop_date  = DateTime.new("2020-08-13").in-timezone(28800);  \
            my @a = lines;  my @b = do for @a.map(*.words) { .[3..4].join    \
                .subst(/^ (\d**2) \/ (Aug) \/ (\d**4) \: /, {"$2-08-$0T"} )  \
                .subst(/ (\+\d**2) (\d**2) $/, {"$0:$1"} ).DateTime  };      \
            for @b.kv -> $k,$v {put @a[$k] if $v ~~ $start_date .. $stop_date};'  file

#OR:

~$ raku -e 'my $start_date = DateTime.new("2020-08-11").in-timezone(28800);     \
            my $stop_date  = DateTime.new("2020-08-13").in-timezone(28800);     \            
            my @a = lines;  my @b = @a.map(*.words).map: { (                    \
              .[3].subst(/^ (\d**2) \/ (Aug) \/ (\d**4) \: /, {"$2-08-$0T"} ),  \
              .[4].subst(/ (\+\d**2) (\d**2) $/, {"$0:$1"} )).join.DateTime };  \
            for @b.kv -> $k,$v {put @a[$k] if $v ~~ $start_date .. $stop_date};'   file 

Above are answers written in Raku, a member of the Perl-family of programming languages. Raku has ISO 8601 DateTime objects built-in, and can search for DateTimes within a range.

Briefly, lines are read in and saved in the @a array. Using (whitespace-separated) words, the date and time are extracted from each line of @a and converted to an ISO 8601 DateTime object, which is saved in the @b array (the date/time elements are found in zero-indexed columns .[3] and .[4]).

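Then each line is tested to see whether its DateTime object falls within the desired $start_date .. $stop_date range. Note the in-timezone method, which takes "the offset in seconds from GMT" as its argument (28800 seconds is +0800) and returns the same instant expressed in that timezone. In the last line, .kv pairs each @b DateTime with its zero-based index, and each is tested individually: if the computed DateTime is within the desired range, the corresponding full line @a[$k] is printed. Because $stop_date marks the start of 2020-08-13 rather than its end, the entry logged late on 13 Aug falls outside the range, as the sample output shows.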

Sample Input:

10.434.22.334 - unauthenticated 10/Aug/2020:23:45:45 +0800 "GET /eai/random.jsp HTTP/1.1" 200 74
10.434.22.334 - unauthenticated 10/Aug/2020:23:45:45 +0800 "GET /eai/random.jsp HTTP/1.1" 200 74
10.434.22.334 - unauthenticated 11/Aug/2020:23:34:45 +0800 "GET /eai/random.jsp HTTP/1.1" 200 74
10.434.22.334 - unauthenticated 12/Aug/2020:23:45:43 +0800 "GET /eai/random.jsp HTTP/1.1" 200 74
10.434.22.334 - unauthenticated 13/Aug/2020:23:43:45 +0800 "GET /eai/random.jsp HTTP/1.1" 200 74
10.434.22.334 - unauthenticated 14/Aug/2020:23:33:45 +0800 "GET /eai/random.jsp HTTP/1.1" 200 74

Sample Output (both code examples):

10.434.22.334 - unauthenticated 11/Aug/2020:23:34:45 +0800 "GET /eai/random.jsp HTTP/1.1" 200 74
10.434.22.334 - unauthenticated 12/Aug/2020:23:45:43 +0800 "GET /eai/random.jsp HTTP/1.1" 200 74

https://www.iso.org/iso-8601-date-and-time-format.html
https://docs.raku.org/routine/in-timezone
https://docs.raku.org
https://raku.org

0

This is a good task for Perl and its core Time::Piece module:

perl -MTime::Piece -sne '
    BEGIN {
        my $t = localtime;
        our $monthsRe = join "|", $t->mon_list;
        our $since = Time::Piece->strptime($since, "%d/%b/%Y")->epoch;
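        # Add one day so that the end date itself is included; the bounds are parsed as UTC.
        our $until = Time::Piece->strptime($until, "%d/%b/%Y")->epoch + 86400;
    }
    if (m!(\d{2}/(?:$monthsRe)/\d{4}:\d{2}:\d{2}:\d{2})\s([+-]\d{4})!) {
        # Parse the log timestamp, including its UTC offset, into epoch seconds.
        my $d = Time::Piece->strptime("$1 $2", "%d/%b/%Y:%H:%M:%S %z")->epoch;
        if ($d >= $since && $d < $until) {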
            print;
        }
    }
' -- -since=10/Aug/2020 -until=13/Aug/2020 access_log
