Find all files, create CSV with one row per subdirectory and file names in columns

Question

I have a directory with subdirectories and files structured like this:

01/fileA
01/fileB
01/fileC
02/fileD
02/fileE
03/fileF
03/fileG
03/fileH
04/fileI

I'd like to get a CSV that looks like this:

01, fileA, fileB, fileC
02, fileD, fileE
03, fileF, fileG, fileH
04, fileI

In other words, I want to generate a CSV with one row per subdirectory, with files listed as columns.

Is it possible to do this from the Linux command line?

When you say Bash, you mean bash and only bash? No sed, awk, find or other standard utils?
– AlwaysLearning, Commented Nov 17, 2016 at 21:59
Any standard utils are okay, just edited question to clarify.
– Nathan, Commented Nov 17, 2016 at 22:02
FYI, standard CSVs should not have a space and a comma as a delimiter, just a comma. It's less human readable, but more computer friendly.
– Wildcard, Commented Nov 18, 2016 at 1:53

Göran Uddeborg · Accepted Answer · 2016-11-17 22:19:35Z

2

That can be done in a number of ways. One simple method is this:

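for d in *
do  printf '%s, ' "$d"    # directory name followed by the separator
    ls -m -- "$d"         # -m prints entries comma-separated; quotes protect odd names
done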


Thank you, this helps a lot. But I'm sometimes getting a line feed before the last file: 01, fileA, fileB, fileC [LF] 02, fileD [LF] fileE [LF] 03, FileF [LF] and so on. There are no commas in the file names either.
– Nathan, Commented Nov 17, 2016 at 22:40
That sounds strange. It looks as if the ls for directory 02 was run without -m. Do you have an actual example? A file tree that causes this?
– Göran Uddeborg, Commented Nov 17, 2016 at 23:02
Use ls -w0 -m "$d" to avoid hitting the default line length too early.
– Chris Davies, Commented Nov 17, 2016 at 23:20
Ah, good point!
– Göran Uddeborg, Commented Nov 18, 2016 at 18:56
On my OS version, ls -w0 caused an error, but I was able to use ls -w99999999. Thanks!
– Nathan, Commented Nov 18, 2016 at 22:00
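
The wrapping discussed in these comments comes from ls -m folding its output at a line width. A rough sketch that sidesteps ls altogether by letting the shell's globs do the listing is shown below. The function name dirs_to_csv is made up for illustration, and the sketch assumes that only regular files one level down matter and that names starting with a dot can be ignored.

```shell
# dirs_to_csv (hypothetical name): print one row per subdirectory of the
# given directory (default: current directory) as "dir, file1, file2, ...".
# The body runs in a subshell, so its variables do not leak into the caller.
dirs_to_csv() (
    root=${1:-.}
    for d in "$root"/*/; do
        [ -d "$d" ] || continue        # the glob matched nothing
        d=${d%/}                       # drop the trailing slash
        line=${d##*/}                  # the row starts with the directory name
        for f in "$d"/*; do
            [ -f "$f" ] || continue    # skip nested directories and unmatched globs
            line="$line, ${f##*/}"     # append the bare file name
        done
        printf '%s\n' "$line"
    done
)
```

Running dirs_to_csv . in the directory from the question should give the four rows asked for. Because the row is built in a shell variable, no line-width option is involved, so the ls -w0 versus ls -w99999999 difference does not arise.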


steeldriver · Accepted Answer · 2016-11-18 01:41:06Z

It's probably overkill, but using GNU datamash:

find 0? -type f | sort -t/ | datamash -t\/ groupby 1 collapse 2 | sed 's/\//,/'
01,fileA,fileB,fileC
02,fileD,fileE
03,fileF,fileG,fileH
04,fileI

Or with a Perl hash of arrays:

find 0? -type f | perl -F/ -alne '
  push @{$dirs{$F[0]}}, $F[1]; 
  END{
    for $d (sort keys %dirs) {print join ",", $d, sort @{$dirs{$d}}}
  }'
01,fileA,fileB,fileC
02,fileD,fileE
03,fileF,fileG,fileH
04,fileI

Or with GNU awk:

find 0? -type f | sort -t/ | gawk -F/ '
  {dirs[$1] = dirs[$1] "," $2} 
  END {
    n = asorti(dirs,sdirs); 
    for(i=1;i<=n;i++) print sdirs[i] "" dirs[sdirs[i]]
}'
01,fileA,fileB,fileC
02,fileD,fileE
03,fileF,fileG,fileH
04,fileI

With GNU awk 4.0 or later, you can simplify the array traversal to:

  END {
    PROCINFO["sorted_in"] = "@ind_num_asc";
    for (d in dirs) print d "" dirs[d];
  }'
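
Where gawk is not available, the same grouping can be sketched with POSIX awk. This is not one of the methods above but a variant of them: it relies on sort to bring each directory's files together and prints a row whenever the directory changes, so no array sorting is needed. The function name group_by_dir is hypothetical, and the sketch assumes relative paths of the form dir/file with no newlines in the names.

```shell
# group_by_dir (hypothetical name): read "dir/file" lines on stdin and
# print one "dir,file1,file2,..." row per directory.
group_by_dir() {
    LC_ALL=C sort | awk -F/ '
        $1 != dir {                  # first line of a new directory
            if (NR > 1) print row    # flush the row built so far
            dir = $1; row = $1
        }
        { row = row "," $2 }         # append the file name to the current row
        END { if (NR > 0) print row }
    '
}
```

It would be used like the pipelines above, for example find 0? -type f | group_by_dir.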

Chris Davies · Accepted Answer · 2016-11-17 23:19:25Z

0

Here is another solution:

find * -type d -printf "\n%p, " -exec ls -w0 -m {} \; |
    sed -e '/^$/d' -e 's/, *$//'

Output

01, fileA, fileB, fileC
02, fileD, fileE
03, fileF, fileG, fileH
04, fileI
