
I have a record like

192.168.28.168  user82  [08/May/2010:09:52:52]  "GET /NoAuth/js/titlebox-state.js HTTP/1.1"     "http://www.example.com/index.html"     "Mozilla/5.0 (Windows; U; Windows NT 5.1; rv:1.9.2.3) Gecko/20100401 Firefox/3.6.3 GTB7.0" 

I want the final output to display only

   /NoAuth/js/titlebox-state.js HTTP/1.1

With this command I get the following

cut -f4 example.log

"GET /NoAuth/js/titlebox-state.js HTTP/1.1"

But I also need to remove the leading "GET (and the closing quote). How can I do that with cut, awk, or sed?

3 Answers


Awk approach (awk splits on whitespace by default, so "GET is $4, the path is $5, and HTTP/1.1" is $6):

awk '{ sub(/"/, "", $6); print $5, $6 }' file

The output:

/NoAuth/js/titlebox-state.js HTTP/1.1
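
The command above relies on the request always being fields 4 to 6. A variant sketch (not the answerer's command) splits on double quotes instead, so the whole request becomes one field. The sample line below is an assumption: it uses tabs between columns, as the asker's cut -f4 suggests, and a shortened user-agent string.

```shell
# Sample record modelled on the question; tab-separated, user agent shortened.
line=$(printf '192.168.28.168\tuser82\t[08/May/2010:09:52:52]\t"GET /NoAuth/js/titlebox-state.js HTTP/1.1"\t"http://www.example.com/index.html"\t"Mozilla/5.0"')

# With " as the field separator, $2 is the text inside the first pair of quotes.
# sub() then strips the first word (the HTTP method) and the space after it.
awk_out=$(printf '%s\n' "$line" | awk -F'"' '{ sub(/^[^ ]+ /, "", $2); print $2 }')

printf '%s\n' "$awk_out"   # /NoAuth/js/titlebox-state.js HTTP/1.1
```

Because the method is matched as "any first word", this form would also handle POST or HEAD requests, which may or may not be what you want.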

Sed approach:

sed -n 's/.*"GET \([^ ]* HTTP\/[0-9\.]*\)".*/\1/p' example.log

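The pattern matches "GET <non-space characters> HTTP/<digits and dots>" and replaces the whole line with the part captured by the escaped parentheses \( ... \), referenced as \1. The -n option together with the p flag prints only the lines where the substitution succeeded.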
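
To see how -n and the p flag filter out non-matching lines, here is a small sketch that runs the same sed command over two made-up records (the addresses, users, and the "UA" user agent are placeholders, not data from the question):

```shell
# Two hypothetical records: a GET request (matches) and a POST request (does not).
log=$(printf '%s\n' \
  '10.0.0.1 u1 [08/May/2010:09:52:52] "GET /NoAuth/js/titlebox-state.js HTTP/1.1" "http://www.example.com/index.html" "UA"' \
  '10.0.0.2 u2 [08/May/2010:09:52:53] "POST /login HTTP/1.1" "-" "UA"')

# Same substitution as above; only lines where it succeeds are printed.
sed_out=$(printf '%s\n' "$log" | sed -n 's/.*"GET \([^ ]* HTTP\/[0-9\.]*\)".*/\1/p')

printf '%s\n' "$sed_out"   # /NoAuth/js/titlebox-state.js HTTP/1.1
```

The POST line is silently dropped because the pattern hard-codes GET.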


Alternative approach with GNU grep and Perl-compatible regexps (-P), printing only the matched part (-o):

$ echo "$a"
192.168.28.168  user82  [08/May/2010:09:52:52]  "GET /NoAuth/js/titlebox-state.js HTTP/1.1"     "http://www.example.com/index.html"     "Mozilla/5.0 (Windows; U; Windows NT 5.1; rv:1.9.2.3) Gecko/20100401 Firefox/3.6.3 GTB7.0"

$ echo "$a" |grep -Po '(?<=GET ).*(?=".*"http)'
/NoAuth/js/titlebox-state.js HTTP/1.1
$ # or, without the space in the lookbehind (the leading space is then kept in the output):
$ echo "$a" | grep -Po '(?<=GET).*(?=".*"http)'
 /NoAuth/js/titlebox-state.js HTTP/1.1

(?<=GET ) == lookbehind: the match must be preceded by the word GET and a space
.* == match any characters, zero or more times, between the lookbehind and the lookahead
(?=".*"http) == lookahead: the match must be followed by ", then any characters, then "http
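
One thing to watch: the lookahead needs a later field that starts with "http, so a record whose referer is "-" produces no output at all. A possible variant, assuming GNU grep built with -P support, keeps the lookbehind and simply stops at the next double quote. The record below is made up for illustration.

```shell
# Hypothetical record with an empty referer ("-") and a placeholder user agent.
rec='10.0.0.1 u1 [08/May/2010:09:52:52] "GET /index.html HTTP/1.1" "-" "UA"'

# Pattern from the answer: no "http follows, so grep finds nothing (|| true
# keeps the non-zero exit status from aborting a script run with set -e).
orig_out=$(printf '%s\n' "$rec" | grep -Po '(?<=GET ).*(?=".*"http)' || true)

# Variant: after the lookbehind, take everything up to the next double quote.
alt_out=$(printf '%s\n' "$rec" | grep -Po '(?<="GET )[^"]*' || true)

printf 'original: [%s]\nvariant:  [%s]\n' "$orig_out" "$alt_out"
```

The variant does not depend on what the later fields contain, at the cost of assuming the request itself never contains a double quote.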
