I am parsing some log files and have grepped out the errors. Each line looks something like this:

CreateOrder_hostname1.domain.com_201509291530_tee.log:2015-09-29 15:42:06,715:ERROR  :Thread-26_CreateOrder: [1443555726715] Error description [system]: Method1
ScheduleOrder_hostname2.domain.com_201509291530_tee.log:2015-09-29 15:42:06,715:ERROR  :Thread-26_ScheduleOrder: [1443555726715] Error description 2 [system]: Method2
ScheduleOrder_hostname2.domain.com_201509291530_tee.log:2015-09-29 15:42:06,715:ERROR  :Thread-26_ScheduleOrder: [1443555726715] Error description 3 [system]: Method3
ShipOrder_hostname3.domain.com_201509291530_tee.log:2015-09-29 15:42:06,715:ERROR  :Thread-26_ShipOrder: [1443555726715] Error description 4 [system]: Method4

How can I transform these lines so that they read something like this?

CreateOrder: 2015-09-29 15:42:06: Error description
ScheduleOrder: 2015-09-29 15:42:06: Error description 2
ScheduleOrder: 2015-09-29 15:42:06: Error description 3
ShipOrder: 2015-09-29 15:42:06: Error description 4
7
  • Do you want the description of the error or just the string Error description? The latter seems to be what you're asking for but doesn't seem very useful. Commented Sep 30, 2015 at 15:49
  • I want the actual error description from the line. In this example it is "Error description" but it could be any string. Commented Sep 30, 2015 at 15:51
  • Is it always CreateOrder in the beginning or can this string differ? Commented Sep 30, 2015 at 15:52
  • 1
    OK, but what will be the same for all lines? What can we use to anchor the regular expression? Will the error description always be between [ and ]? Will it always be between the penultimate and last :? Will it always have [foo] right after it? Please show us some more examples so we don't give you something that only works on this one. Commented Sep 30, 2015 at 15:53
  • My apologies, I only gave one line of my log file. It could be anything and I listed a few more examples. Commented Sep 30, 2015 at 15:55

4 Answers

6

With sed:

sed 's/^\([^_]*\)_[^:]*:\([^,]*\)[^]]*\]\([^[]*\).*/\1: \2:\3/'
  • ^\([^_]*\) matches from the start of the line (^). The part inside the escaped parentheses \(...\) is saved to sed's back-reference \1:
    • [^_]* matches any character that is not an underscore (_), zero or more times (*).
  • _[^:]*: then matches the underscore, every following character that is not a :, and the : itself (the hostname and file name).
  • \([^,]*\) is again inside parentheses and saved to \2: every character up to the , after the time.
  • [^]]*\] skips ahead to the first ] (the one just before the error description).
  • \([^[]*\) then matches everything up to the next opening square bracket [ and saves it to \3.
  • \1: \2:\3 finally replaces the whole line with the formatted output built from \1, \2 and \3.

The output:

CreateOrder: 2015-09-29 15:42:06: Error description 
ScheduleOrder: 2015-09-29 15:42:06: Error description 2 
ScheduleOrder: 2015-09-29 15:42:06: Error description 3 
ShipOrder: 2015-09-29 15:42:06: Error description 4 
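
For comparison, here is a rough sketch of the same substitution written with extended regular expressions (sed -E, available in GNU and BSD sed), which removes most of the backslashes. The function name reformat_errors is only a placeholder for illustration, and the second expression is an optional extra that strips the trailing space the original command leaves behind:

```shell
# Hypothetical helper: reads grepped log lines on stdin, prints the short form.
reformat_errors() {
    # 1st expression: same captures as above, in ERE syntax (no \( \) needed).
    # 2nd expression: optional, drops trailing whitespace after the description.
    sed -E \
        -e 's/^([^_]*)_[^:]*:([^,]*)[^]]*\]([^[]*).*/\1: \2:\3/' \
        -e 's/[[:space:]]+$//'
}

# Example run on one of the lines from the question.
printf '%s\n' 'CreateOrder_hostname1.domain.com_201509291530_tee.log:2015-09-29 15:42:06,715:ERROR  :Thread-26_CreateOrder: [1443555726715] Error description [system]: Method1' |
    reformat_errors
```

This still assumes every line has the same shape as the samples in the question: an underscore after the job name, a comma after the time, and the description sitting between a ] and the next [.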
2
  • Thank you for the answer and explanation. Another answer was in slightly before yours but I appreciate your answer and you taking the time to explain what you wrote. Commented Sep 30, 2015 at 16:23
  • @Matt No problem, knowledge sharing is the primary aspect on this site, not reputation. Commented Sep 30, 2015 at 17:30
5

This should work:

$ perl -pe 's/^(.+?)_.+?:(.+?),.*?\](.+?)\[.*/$1: $2:$3/' file 
CreateOrder: 2015-09-29 15:42:06: Error description 
ScheduleOrder: 2015-09-29 15:42:06: Error description 2 
ScheduleOrder: 2015-09-29 15:42:06: Error description 3 
ShipOrder: 2015-09-29 15:42:06: Error description 4 

Explanation

  • perl -pe: the -p means "print every line after applying the script given by -e".
  • s/^(.+?)_.+?:(.+?),.*?\](.+?)\[.*/$1: $2:$3/ : the regular expression captures everything up to the first _ ((.+?)_) as $1. It then skips everything up to the first : (.+?:) and saves what follows, up to the first comma ((.+?),), as $2. Next it skips to the first ] (.*?\]) and captures everything after that up to the next [ ((.+?)\[) as $3. Finally, .* matches the rest of the line. All of this is replaced with $1: $2:$3.
3
  • Same concept as mine but, s/sed/perl/... +1 Commented Sep 30, 2015 at 16:16
  • @chaos great minds :) I've always wondered why sed doesn't implement non-greedy matches. They are very useful. Commented Sep 30, 2015 at 16:19
  • Thank you for the reply and explanation. This works nicely. Commented Sep 30, 2015 at 16:28
1

Another way is to remove the unnecessary parts and keep the necessary ones:

sed 's/_[^:]*:/: /;s/,[^]]*\]/:/;s/\[.*//'

Outputs:

CreateOrder: 2015-09-29 15:42:06: Error description 
ScheduleOrder: 2015-09-29 15:42:06: Error description 2 
ScheduleOrder: 2015-09-29 15:42:06: Error description 3 
ShipOrder: 2015-09-29 15:42:06: Error description 4
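
To see what each of the three substitutions removes, they can be applied one at a time. This is only an illustrative sketch; the variable names line, step1, step2 and step3 are made up for the example:

```shell
line='CreateOrder_hostname1.domain.com_201509291530_tee.log:2015-09-29 15:42:06,715:ERROR  :Thread-26_CreateOrder: [1443555726715] Error description [system]: Method1'

# 1. Replace the hostname and file name (first _ up to the first :) with ": ".
step1=$(printf '%s\n' "$line" | sed 's/_[^:]*:/: /')

# 2. Replace everything from the comma after the time up to the first ] with ":".
step2=$(printf '%s\n' "$step1" | sed 's/,[^]]*\]/:/')

# 3. Delete everything from the next [ to the end of the line.
step3=$(printf '%s\n' "$step2" | sed 's/\[.*//')

printf '%s\n' "$step1" "$step2" "$step3"
```

Chaining the three expressions with ; in a single sed call, as in the command above, gives the same final result without the intermediate variables.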
0

It's hard to tell what you're looking for, since the error description just seems to say Error Description. This keeps it and the identifying stuff around it:

sed 's/[_,][^:-]*:/ /g
' <<\IN
CreateOrder_hostname1.domain.com_201509291530_tee.log:2015-09-29 15:42:06,715:ERROR  :Thread-26_CreateOrder: [1443555726715] Error description [system]: Method1
ScheduleOrder_hostname2.domain.com_201509291530_tee.log:2015-09-29 15:42:06,715:ERROR  :Thread-26_ScheduleOrder: [1443555726715] Error description 2 [system]: Method2
ScheduleOrder_hostname2.domain.com_201509291530_tee.log:2015-09-29 15:42:06,715:ERROR  :Thread-26_ScheduleOrder: [1443555726715] Error description 3 [system]: Method3
ShipOrder_hostname3.domain.com_201509291530_tee.log:2015-09-29 15:42:06,715:ERROR  :Thread-26_ShipOrder: [1443555726715] Error description 4 [system]: Method4
IN

...that prints...

CreateOrder 2015-09-29 15:42:06 ERROR  :Thread-26  [1443555726715] Error description [system]: Method1
ScheduleOrder 2015-09-29 15:42:06 ERROR  :Thread-26  [1443555726715] Error description 2 [system]: Method2
ScheduleOrder 2015-09-29 15:42:06 ERROR  :Thread-26  [1443555726715] Error description 3 [system]: Method3
ShipOrder 2015-09-29 15:42:06 ERROR  :Thread-26  [1443555726715] Error description 4 [system]: Method4

I dunno if that's too much or too little, or if it's even on the right track. I played around with dropping the boxed stuff, too.

sed 's/[_,[][^]:-]*[]:]/ /g
' <<\IN
CreateOrder_hostname1.domain.com_201509291530_tee.log:2015-09-29 15:42:06,715:ERROR  :Thread-26_CreateOrder: [1443555726715] Error description [system]: Method1
ScheduleOrder_hostname2.domain.com_201509291530_tee.log:2015-09-29 15:42:06,715:ERROR  :Thread-26_ScheduleOrder: [1443555726715] Error description 2 [system]: Method2
ScheduleOrder_hostname2.domain.com_201509291530_tee.log:2015-09-29 15:42:06,715:ERROR  :Thread-26_ScheduleOrder: [1443555726715] Error description 3 [system]: Method3
ShipOrder_hostname3.domain.com_201509291530_tee.log:2015-09-29 15:42:06,715:ERROR  :Thread-26_ShipOrder: [1443555726715] Error description 4 [system]: Method4
IN

...that prints...

CreateOrder 2015-09-29 15:42:06 ERROR  :Thread-26    Error description  : Method1
ScheduleOrder 2015-09-29 15:42:06 ERROR  :Thread-26    Error description 2  : Method2
ScheduleOrder 2015-09-29 15:42:06 ERROR  :Thread-26    Error description 3  : Method3
ShipOrder 2015-09-29 15:42:06 ERROR  :Thread-26    Error description 4  : Method4

...which looks like the kinda stuff I'd wanna see, maybe.

This one drops the Description bit entirely, but maybe still tells the same story? Bear in mind, it is kind of difficult to match a string which you say could be anything and which doesn't seem to serve any real purpose either. Anyway, it's fun, too.

sed 's/[_,][^-]*[^ ]:/ /g
' <<\IN
CreateOrder_hostname1.domain.com_201509291530_tee.log:2015-09-29 15:42:06,715:ERROR  :Thread-26_CreateOrder: [1443555726715] Error description [system]: Method1
ScheduleOrder_hostname2.domain.com_201509291530_tee.log:2015-09-29 15:42:06,715:ERROR  :Thread-26_ScheduleOrder: [1443555726715] Error description 2 [system]: Method2
ScheduleOrder_hostname2.domain.com_201509291530_tee.log:2015-09-29 15:42:06,715:ERROR  :Thread-26_ScheduleOrder: [1443555726715] Error description 3 [system]: Method3
ShipOrder_hostname3.domain.com_201509291530_tee.log:2015-09-29 15:42:06,715:ERROR  :Thread-26_ShipOrder: [1443555726715] Error description 4 [system]: Method4
IN

CreateOrder 2015-09-29 15:42:06 ERROR  :Thread-26  Method1
ScheduleOrder 2015-09-29 15:42:06 ERROR  :Thread-26  Method2
ScheduleOrder 2015-09-29 15:42:06 ERROR  :Thread-26  Method3
ShipOrder 2015-09-29 15:42:06 ERROR  :Thread-26  Method4
