edit approved May 17, 2017 at 14:27

I make extensive use of the Perl module Text::CSV_XS for heavy-duty, ad hoc manipulation of CSV files. Using this module, I've built four small, basic Perl programs to use as building blocks for whatever I want to do.

Filter - Filter inputFile filterFile field
Reject - Reject inputFile filterFile field
Stripper - Stripper inputFile field [field2 field3…]
Swap - Swap inputFile swapFile matchField outfield

The filterFile has one regex pattern per line. A row whose value in the given field matches any of these patterns counts as matched, and is then accepted (Filter) or rejected (Reject). The various "field" arguments are column header names.
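To make the idea concrete, here is a minimal sketch of what a Filter-style program could look like. It is not my actual code: the names `load_patterns` and `filter_csv` are made up for illustration, and it assumes the first row of the input is the header and that patterns are applied unanchored to the named column.

```perl
#!/usr/bin/env perl
# Hypothetical sketch of a Filter-style tool; names and details are assumptions.
use strict;
use warnings;
use Text::CSV_XS;

# Read one regex per line from the filter file, skipping blank lines.
sub load_patterns {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!\n";
    my @patterns;
    while (my $line = <$fh>) {
        chomp $line;
        next if $line eq '';
        push @patterns, qr/$line/;
    }
    close $fh;
    return @patterns;
}

# Copy the header, then every row whose $field value matches any pattern.
sub filter_csv {
    my ($in_fh, $out_fh, $field, @patterns) = @_;
    # Separate objects: the reader accepts any line ending, the writer emits "\n".
    my $reader = Text::CSV_XS->new({ binary => 1, auto_diag => 1 });
    my $writer = Text::CSV_XS->new({ binary => 1, auto_diag => 1, eol => "\n" });

    my $header = $reader->getline($in_fh) or die "Empty input\n";
    my ($idx) = grep { $header->[$_] eq $field } 0 .. $#$header;
    die "No column named $field\n" unless defined $idx;

    $writer->print($out_fh, $header);
    while (my $row = $reader->getline($in_fh)) {
        my $value = defined $row->[$idx] ? $row->[$idx] : '';
        $writer->print($out_fh, $row) if grep { $value =~ $_ } @patterns;
    }
}

# Command-line entry point: inputFile filterFile field
if (!caller && @ARGV) {
    die "Usage: $0 inputFile filterFile field\n" unless @ARGV == 3;
    my ($input, $filter, $field) = @ARGV;
    open my $in, '<', $input or die "Cannot open $input: $!\n";
    filter_csv($in, \*STDOUT, $field, load_patterns($filter));
}
```

In this sketch the patterns are unanchored, so a line containing just `1` would also match a value such as `10`; writing `^1$` restricts it to exact matches. Negating the `grep` test would give Reject-style behavior.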

So in your example, I just put "1" in filter.txt and run:

perl Filter.pm data.csv filter.txt data_id >One.csv
perl Stripper.pm One.csv data_id event_value >Two.csv
perl Swap.pm Two.csv log.csv data_id name >Three.csv
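For the second step, a Stripper-style routine might look roughly like the sketch below. It assumes that the program keeps only the named columns (in the order given) and drops the rest, which is consistent with the pipeline above, where Swap still needs data_id afterwards. The function name `keep_columns` is hypothetical, and the real program may behave differently.

```perl
#!/usr/bin/env perl
# Hypothetical sketch of a Stripper-style tool.
# ASSUMPTION: it keeps the listed columns and discards all others.
use strict;
use warnings;
use Text::CSV_XS;

sub keep_columns {
    my ($in_fh, $out_fh, @fields) = @_;
    my $reader = Text::CSV_XS->new({ binary => 1, auto_diag => 1 });
    my $writer = Text::CSV_XS->new({ binary => 1, auto_diag => 1, eol => "\n" });

    # Map each header name to its column position.
    my $header = $reader->getline($in_fh) or die "Empty input\n";
    my %position;
    @position{@$header} = 0 .. $#$header;

    # Resolve the requested names to positions, failing on unknown names.
    my @wanted;
    for my $name (@fields) {
        die "No column named $name\n" unless exists $position{$name};
        push @wanted, $position{$name};
    }

    # Emit the reduced header, then the same slice of every data row.
    $writer->print($out_fh, [ @{$header}[@wanted] ]);
    while (my $row = $reader->getline($in_fh)) {
        $writer->print($out_fh, [ @{$row}[@wanted] ]);
    }
}

# Command-line entry point: inputFile field [field2 field3...]
if (!caller && @ARGV) {
    die "Usage: $0 inputFile field [field2 ...]\n" unless @ARGV >= 2;
    my ($input, @fields) = @ARGV;
    open my $in, '<', $input or die "Cannot open $input: $!\n";
    keep_columns($in, \*STDOUT, @fields);
}
```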

If we also wanted Leopold's events, filter.txt would have two lines, one for each data_id:

1
2

I have assorted mutant versions of all four building-block routines that do things like take input from STDIN or post output to a specific URL.

answered May 14, 2017 at 21:44

Nadreck