
I have hit a wall with my limited sed scripting abilities, and I wonder if anyone could help me out.

I have a non-standard Apache access log with the following format:

#Version:   1.0
#Fields:    c-ip date time cs-method cs-uri sc-status time-taken bytes
#Software:  WebLogic
#Start-Date:    2014-07-21  11:21:59

10.000.000.000  2014-07-21  11:22:16    GET /em/skins/login.css 200 0.1 1091
10.000.000.000  2014-07-21  13:55:36    POST    /sbconsole/sbconsole.portal?_nfpb=true&_pageLabel=Projects_ViewProjects&ProjectsPortlet=    200 0.766   519376

The script I have mashed together is:

sed -i  's/[[:space:]]\+/ /g;s/\([0-9][0-9][0-9][0-9]\)\([0-9][0-9]\)\/\([0-9][0-9]\)/\3-\2-\1/;s:-:/:g' log.access 

That is as far as I got, and I would love some help so that I end up with the following format in the access log:

10.000.000.000 - - [21/07/2014:11:22:16 +0200] "GET /em/skins/login.css HTTP/1.1" 200 1091
10.000.000.000 - - [21/07/2014:13:55:36 +0200] "POST /sbconsole/sbconsole.portal?_nfpb=true&_pageLabel=Projects_ViewProjects&ProjectsPortlet= HTTP/1.1" 200 519376

Just FYI: the log contains many different IPs making GET/POST requests.


The following awk line got me the output that I wanted:

awk '!/^#/ && NF{split($2,a,"-"); printf "%s - - [%s/%s/%s:%s] \"%s %s\" %s %s %s\n", $1, a[3], a[2], a[1], $3" +0200", $4, $5" HTTP/1.1", $6, $7, $8}' alm_server1_51100_access.log > test.test

All thanks to fedorqui

1 Answer


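Formatted for readability (the opening brace has to stay on the same line as the condition, otherwise awk treats the condition and the block as two separate rules):

awk '!/^#/ && NF {
      split($2,a,"-")
      printf "%s - - [%s/%s/%s:%s] \"%s %s\" %s %s\n", $1, a[3], a[2], a[1], $3, $4, $5, $6, $7
     }' file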

If your input consists only of the lines starting with 10.000..., this does it:

$ awk '{printf "%s - - [%s:%s] \"%s %s\" %s %s\n", $1, $2, $3, $4, $5, $6, $7}' file
10.000.000.000 - - [2014-07-21:11:22:16] "GET /em/skins/login.css" 200 0.1
10.000.000.000 - - [2014-07-21:13:55:36] "POST /sbconsole/sbconsole.portal?_nfpb=true&_pageLabel=Projects_ViewProjects&ProjectsPortlet=" 200 0.766

If you also want to skip the empty lines and those starting with #, use this:

awk '!/^#/ && NF{printf "%s - - [%s:%s] \"%s %s\" %s %s\n", $1, $2, $3, $4, $5, $6, $7}' file

Both commands use the same printf format string, which places each field between the dashes, brackets and quotes that you want.
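To see what the `!/^#/ && NF` condition does on its own, here is a minimal sketch that wraps it in a shell function and feeds it three lines taken from the question. The function name `data_lines` is made up for this illustration.

```shell
# data_lines is a hypothetical helper name; it only wraps the filter used above.
data_lines() {
  # !/^#/ is true for lines that do not start with '#'.
  # NF (the field count) is 0, i.e. false, on empty or whitespace-only lines.
  # With no action block, awk prints each line for which the condition holds.
  awk '!/^#/ && NF' "$@"
}

# Feed it a header line, an empty line and one data line from the question.
printf '%s\n' \
  '#Software:  WebLogic' \
  '' \
  '10.000.000.000  2014-07-21  11:22:16    GET /em/skins/login.css 200 0.1 1091' \
  | data_lines
```

Only the data line passes the filter. Adding a `{ printf ... }` block after the condition then reformats just those lines.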

To change the date format, use split() on the date field and print the elements of the resulting array a[] in a different order:

$ awk '!/^#/ && NF{split($2,a,"-"); printf "%s - - [%s/%s/%s:%s] \"%s %s\" %s %s\n", $1, a[3], a[2], a[1], $3, $4, $5, $6, $7}' file
10.000.000.000 - - [21/07/2014:11:22:16] "GET /em/skins/login.css" 200 0.1
10.000.000.000 - - [21/07/2014:13:55:36] "POST /sbconsole/sbconsole.portal?_nfpb=true&_pageLabel=Projects_ViewProjects&ProjectsPortlet=" 200 0.766
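The output above still lacks the timezone offset and the protocol string, and its last column is time-taken rather than the byte count. The following is a minimal sketch of how the target format from the question could be produced. It assumes that `+0200` and `HTTP/1.1` are constants, since the log records neither, and the function name `to_clf` is made up for this example.

```shell
# to_clf is a hypothetical helper name; it reads the log from stdin or from file arguments.
to_clf() {
  awk '
    !/^#/ && NF {
      split($2, d, "-")   # d[1]=year, d[2]=month, d[3]=day
      # Assumption: the +0200 offset and HTTP/1.1 are hard-coded, the log does not contain them.
      # $7 (time-taken) is skipped on purpose; $8 is the byte count.
      printf "%s - - [%s/%s/%s:%s +0200] \"%s %s HTTP/1.1\" %s %s\n",
             $1, d[3], d[2], d[1], $3, $4, $5, $6, $8
    }' "$@"
}

# Usage with the file name from the question:
#   to_clf log.access > converted.log
```

Note that Apache's own common log format writes the month as a three-letter abbreviation (for example 21/Jul/2014). The numeric month here follows the format requested in the question.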
  • The first one works great (after some small ugly adjustments). I am just missing how to change the date from 2014-07-21 to 21/07/2014. Commented Sep 12, 2014 at 12:02
  • @Acehege uhms, true, didn't notice that. See update with gsub(). Commented Sep 12, 2014 at 12:07
  • That worked great! You are a star! Just one more thing ;) How can I change the format so that it is day/month/year instead of year/month/day? Commented Sep 12, 2014 at 12:22
  • @Acehege see update. Commented Sep 12, 2014 at 12:35
  • That worked! Thank you so much for the help. I will mark this as answered and edit my post with the line that worked for me! Commented Sep 12, 2014 at 12:40
