How to capture the string that comes after a specific word in a CSV line

Question

For example, this is the CSV line from which we want to cut the strings that come after /data/:

status=true /data/sdb/hadoop/hdfs/log,/data/sdc/hadoop/hdfs/log,/data/sdd/hadoop/hdfs/log,/data/sde/hadoop/hdfs/log,/data/sdf/hadoop/hdfs/log

Example of expected results:

sdb
sdc
sdd
sde
sdf

Just for completeness: is the status=true part not separated by a ,?
– AdminBee, Commented Mar 3, 2020 at 11:29
What should the output be if /data/sdb/foo/data/bar existed in the CSV? What if a field was /foo/bar/data/ (i.e. nothing after /data/)?
– Ed Morton, Commented Mar 3, 2020 at 15:57

pLumo · Accepted Answer · 2020-03-03 11:30:17Z

4

Use grep:

with PCRE:

grep -Po '/data/\K[^/]*'

if that is not available:

grep -o '/data/[^/]*' | cut -d'/' -f3
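As a quick check against the sample line from the question, both variants can be run as below. This is only a minimal sketch: the function names `extract_pcre` and `extract_plain` are hypothetical wrappers added for illustration, and the `-P` variant assumes a grep built with PCRE support (GNU grep, for instance).

```shell
#!/bin/sh
# Sample line copied from the question.
line='status=true /data/sdb/hadoop/hdfs/log,/data/sdc/hadoop/hdfs/log,/data/sdd/hadoop/hdfs/log,/data/sde/hadoop/hdfs/log,/data/sdf/hadoop/hdfs/log'

# Hypothetical wrapper: PCRE variant.
# \K drops "/data/" from the reported match, so only the name is printed.
extract_pcre() {
    grep -Po '/data/\K[^/]*'
}

# Hypothetical wrapper: variant without PCRE.
# grep keeps "/data/<name>", then cut takes the third "/"-separated field.
extract_plain() {
    grep -o '/data/[^/]*' | cut -d'/' -f3
}

printf '%s\n' "$line" | extract_pcre
printf '%s\n' "$line" | extract_plain
```

Each call should print sdb through sdf, one name per line. Both variants print every occurrence, so a path such as /data/sdb/foo/data/bar would yield two lines (sdb and bar).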


Jake Ireland · Accepted Answer · 2020-03-03 11:55:58Z

1

@pLumo absolutely has the right answer. If, for whatever reason, you wanted to use awk and bash's builtin parameter expansion, all the while being slightly convoluted...

while IFS= read -r line; do
    # Keep only the commas; the length of the result is the number of separators.
    COUNT_SEP="${line//[^,]}"
    # Start each input line with an empty scratch file.
    : > /tmp/splitCSV
    # Start at column 1: in the sample, status=true shares the first column with /data/sdb/...
    for col in $(seq 1 $((${#COUNT_SEP}+1))); do
        echo "${line}" | awk -v variable="${col}" -F, '{ print $variable }' >> /tmp/splitCSV
    done
    # Print whatever sits between /data/ and the next slash in each column.
    while IFS= read -r splitCol; do
        echo "${splitCol}" | awk -F'/data/' '{ print $2 }' | awk -F'/' '{ print $1 }'
    done < /tmp/splitCSV
done < test.csv


You should never do that. See why-is-using-a-shell-loop-to-process-text-considered-bad-practice for some of the reasons.
– Ed Morton, Commented Mar 3, 2020 at 15:59

Thanks! I didn't know that was best practice. Very interesting.
– Jake Ireland, Commented Mar 3, 2020 at 18:51

Yeah, the guys who invented shell to manipulate files and processes also invented tools like awk for shell to call to manipulate text. So, horses for courses... never write a shell loop just to manipulate text and you can't go wrong.
– Ed Morton, Commented Mar 4, 2020 at 14:41
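
To illustrate the point made in these comments, the work done by the shell loop can be sketched as a single awk call. This is one possible form, not something posted in the thread: it assumes each comma-separated field contains at most one /data/, and the function name `disks_after_data` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the path component that follows "/data/" in each
# comma-separated field (assumes at most one "/data/" per field).
disks_after_data() {
    awk -F, '{
        for (i = 1; i <= NF; i++) {
            # Split the field around "/data/"; parts[2] is everything after it.
            if (split($i, parts, "/data/") > 1) {
                sub("/.*", "", parts[2])   # drop the rest of the path
                print parts[2]
            }
        }
    }' "$@"
}

# Reads standard input here; a file name such as test.csv can be passed instead.
printf '%s\n' 'status=true /data/sdb/hadoop/hdfs/log,/data/sdc/hadoop/hdfs/log' | disks_after_data
```

awk reads the input once and does the splitting itself, so no temporary file and no per-column subprocesses are needed.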


schrodingerscatcuriosity · Accepted Answer · 2020-03-03 12:07:53Z

1

Just to add an option, bearing in mind that only one pattern matches three characters between slashes, with grep and sed:

grep -o "/.../" file | sed 's;/;;g'

Output:

sdb
sdc
sdd
sde
sdf


Praveen Kumar BS · Accepted Answer · 2020-03-03 13:02:41Z

1

For the input above, the command below will work:

perl -pe 's/,/\n/g' filename | awk -F '/data/' '{gsub("/.*","",$2); print $2}'

Output:

sdb
sdc
sdd
sde
sdf


Clement · Accepted Answer · 2020-03-03 14:22:05Z

1

This works for me with awk:

awk -F'/' '{for(i=1;i<=NF;i++) if($i=="data") print $(i+1)}' <file>

1: -F defines the field separator as /

2: loop over every field of each line

3: if a field equals "data", print the next field
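
One caveat with the steps above: because only / separates fields, a name followed directly by a comma (as in /data/sdb,/data/sdc) is printed with the comma attached, and a trailing /data/ prints an empty line. A hedged variant that splits on both characters and skips empty results could look like this; the function name `next_after_data` is hypothetical and the edge-case input is made up for illustration.

```shell
#!/bin/sh
# Hypothetical variant of the awk loop above: split on "/" and ",", and only
# print when a non-empty field actually follows "data".
next_after_data() {
    awk -F'[/,]' '{
        for (i = 1; i < NF; i++)
            if ($i == "data" && $(i+1) != "")
                print $(i+1)
    }' "$@"
}

# Illustrative input: one normal entry and one field that ends in /data/.
printf '%s\n' '/data/sdb,/foo/bar/data/' | next_after_data
```

For this input only sdb is printed; the trailing /data/ has nothing after it and is skipped.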


Rakesh Sharma · Accepted Answer · 2020-03-03 17:06:46Z

1

We can choose from the following:

awk -F/ '
    BEGIN { OFS = RS }
    {
        N = split($0, a, /\//)
        $0 = ""
        for ( i=j=1; i<N; i++ )
            if ( a[i] == "data" )
                $(j++) = a[++i]
    }
    N>1
' file.csv


perl -F/ -lane '
   shift(@F) eq q(data) and print(shift(@F)) 
      while(@F && m{/data/});
' file.csv


perl -lne 'print for m{/data/([^/,]+)}g' file.csv


sed -re '
    /\n/{P;D;}
    s:/data/([^/,]+):\n\1\n:
   D
' file.csv
