0

How can I convert the following file (example in) into the format shown in example out?

The last word of each state line (in example in) should be appended to the end of the previous line.

example in

HDFS  worker01.gtdns.com
state  STARTED
HDFS  worker02.gtdns.com
state  STOP
HDFS  worker03.gtdns.com
state  STARTED
HDFS  worker05.gtdns.com
state  STARTED
HDFS  worker06.gtdns.com
state  STARTED
HDFS  worker07.gtdns.com
state  STARTED
HDFS  worker08.gtdns.com
state  STARTED
HDFS  worker09.gtdns.com
state  STOP

example out ( expected results )

HDFS  worker01.gtdns.com STARTED
HDFS  worker02.gtdns.com STOP
HDFS  worker03.gtdns.com STARTED
HDFS  worker05.gtdns.com STARTED
HDFS  worker06.gtdns.com STARTED
HDFS  worker07.gtdns.com STARTED
HDFS  worker08.gtdns.com STARTED
HDFS  worker09.gtdns.com STOP
1
  • bash is not a text editor. Commented Jan 8, 2018 at 22:51

6 Answers

1
awk '$1 == "HDFS" { printf( "%s ", $0 ) }; $1=="state" { print $2 }' /path/to/input

The awk script is fairly self-explanatory: On lines where the first field is HDFS, append a space to the line and print it as-is with no trailing newline. On lines where the first field is state, print the second field with the (implied) trailing newline.
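
To try the command without a real input file, here is a minimal sketch. It assumes the two-line layout from the question and feeds two of its record pairs through a here-document; the function name join_state is made up for illustration.

```shell
# join_state: hypothetical wrapper around the awk one-liner above.
# Reads the two-line records on stdin and writes one line per host.
join_state() {
    awk '
        # HDFS line: print it followed by a space, but hold back the newline
        $1 == "HDFS"  { printf("%s ", $0) }
        # state line: finish the row with the second field and a newline
        $1 == "state" { print $2 }
    '
}

join_state <<'EOF'
HDFS  worker01.gtdns.com
state  STARTED
HDFS  worker02.gtdns.com
state  STOP
EOF
```

Lines whose first field is neither HDFS nor state are silently dropped, which may or may not be what you want.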

5
  • What's the reason for removing the sed and perl tags ? They're just as valid as awk here... Commented Jan 8, 2018 at 22:55
  • I thought I only removed the sed tag, but may have misdoubleclicked and caught perl's tag inadvertently. Feel free to re-add. Commented Jan 8, 2018 at 22:56
  • I don't want to re-add anything. As I said I'd like to know what's wrong with using e.g. the sed tag here... Commented Jan 8, 2018 at 22:58
  • Using sed to parse multi-line input is generally more complicated and prone to error than it's worth in my opinion, and given the nature of the OP, probably not the right hammer for this particular nail. Commented Jan 8, 2018 at 22:59
  • That's your opinion - a very subjective one at that. Commented Jan 8, 2018 at 23:19
0

Short GNU AWK approach:

awk -v RS='[[:space:]]+state' '{ printf "%s", $0 }' file
  • -v RS='[[:space:]]+state' - treat the substring state, together with the whitespace before it ([[:space:]]+), as the input record separator RS

The output:

HDFS  worker01.gtdns.com  STARTED
HDFS  worker02.gtdns.com  STOP
HDFS  worker03.gtdns.com  STARTED
HDFS  worker05.gtdns.com  STARTED
HDFS  worker06.gtdns.com  STARTED
HDFS  worker07.gtdns.com  STARTED
HDFS  worker08.gtdns.com  STARTED
HDFS  worker09.gtdns.com  STOP

For a static two-line format, you may also try the following sed approach:

sed '/^[[:space:]]*HDFS/{ N; s/[[:space:]]*state // }' file
1
  • The awk option accepts lines that have anything other than HDFS. Probably it doesn't matter, but worth knowing. Commented Jan 9, 2018 at 0:15
0

Using ex, the POSIX-specified scriptable file editor:

printf '%s\n' 'g/state/s/^ *state *//|-j' x | ex file.txt

The s command is a standard substitution. The -j means "on the previous line (-), execute the join command (j)" which joins the subsequent line with a space separation.

Actually, because the join command ignores leading spaces on the line to be joined, and because s reuses the previous regex if no regex is supplied, the following command works just as well and gives the same result:

printf '%s\n' 'g/state/s///|-j' x | ex file.txt

Note that this saves the changes to the file. To view the changes without saving them, use the following instead:

printf '%s\n' 'g/state/s///|-j' %p | ex file.txt
0

GNU awk golfing:

$ awk '1' RS='\n[ \t]*state ' ORS='' file

Testing:

$ awk '1' RS='\n[ \t]*state ' ORS='' file
HDFS  worker01.gtdns.com STARTED
HDFS  worker02.gtdns.com STOP
HDFS  worker03.gtdns.com STARTED
HDFS  worker05.gtdns.com STARTED
HDFS  worker06.gtdns.com STARTED
HDFS  worker07.gtdns.com STARTED
HDFS  worker08.gtdns.com STARTED
HDFS  worker09.gtdns.com STOP

RS is the Input record separator
ORS is the output record separator
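
A regular expression in RS is an extension (POSIX leaves the behaviour of a multi-character RS unspecified). If that matters, a portable sketch can get the same result by switching ORS per line instead. This assumes the lines strictly alternate HDFS/state; the function name join_pairs is hypothetical.

```shell
# join_pairs: hypothetical helper, plain POSIX awk, no regex RS needed.
# Assumes input lines strictly alternate: HDFS line, then state line.
join_pairs() {
    awk '{
        # Strip the "state" label and the blanks around it, if present
        sub(/^[ \t]*state[ \t]+/, "")
        # ORS is read when print runs: odd lines end with a space,
        # even lines end the output row with a newline
        ORS = (NR % 2) ? " " : "\n"
        print
    }'
}

printf 'HDFS  worker01.gtdns.com\nstate  STARTED\n' | join_pairs
```

Because it counts lines rather than matching keywords, a single missing or extra line shifts every pair after it.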

0
0

Assuming there is always exactly one state line after each HDFS line, this solves the issue:

awk '$1=="HDFS"{l=$0;next};$1=="state"{print(l,$2);l=""}' file

$1=="HDFS"{ … } For lines whose first field is HDFS, do …
l=$0;next Store the line in the variable l (lowercase L, for "line") and move on to the next line.
$1=="state"{ … } For lines whose first field is state, do …
{print(l,$2)} Print the line stored in l, followed by field 2.
{l=""} Clear l so that a stale (old) value is never printed.
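
With the script as written, a state line that has no HDFS line before it is still printed, just with an empty host part. If you would rather skip such lines, one possible variant is sketched below. The guard on l and the message on stderr are my additions, not part of the script above, and the function name join_guarded is hypothetical.

```shell
# join_guarded: hypothetical variant that refuses to print orphan state lines.
join_guarded() {
    awk '
        # Remember the most recent HDFS line
        $1 == "HDFS"  { l = $0; next }
        $1 == "state" {
            if (l != "")
                print l, $2        # pair the host line with its state
            else
                print "orphan state at line " NR > "/dev/stderr"
            l = ""                 # never reuse a stale host line
        }
    '
}

# The second state line has no HDFS line of its own, so it is reported
# on stderr instead of being joined to worker01.
printf 'HDFS  worker01.gtdns.com\nstate  STARTED\nstate  STOP\n' | join_guarded
```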

5
  • golfing based on awk's default condition{action} syntax: awk '$1=="HDFS"{l=$0}$1=="state"{print(l,$2)}' Commented Jan 8, 2018 at 23:32
  • Thanks @GeorgeVasiliou answer edited, description added. Commented Jan 8, 2018 at 23:43
  • Just for fun, this can be golfed more if a state line always comes after each HDFS line: awk '$1=="state"{print l,$2}{l=$0}'. Give it a try. Commented Jan 8, 2018 at 23:48
  • Good idea. @GeorgeVasiliou Probably not much faster, but I like this better: awk '$1=="state"{print l,$2;next}{l=$0}' Commented Jan 9, 2018 at 0:05
  • @GeorgeVasiliou Besides, the answer as it stands now seems more robust. Commented Jan 9, 2018 at 0:11
0

I got the result using the sed one-liner below:

 sed "s/state//g" filename| sed "N;s/\n/ /g"

output

HDFS  worker01.gtdns.com   STARTED
HDFS  worker02.gtdns.com   STOP
HDFS  worker03.gtdns.com   STARTED
HDFS  worker05.gtdns.com   STARTED
HDFS  worker06.gtdns.com   STARTED
HDFS  worker07.gtdns.com   STARTED
HDFS  worker08.gtdns.com   STARTED
HDFS  worker09.gtdns.com   STOP
