
I have 125 data files, each containing two columns and 21 rows of data. Please see the image below:

[screenshot: a sample .data file with force and displacement columns]

and I'd like to combine them into a single .csv file (250 columns and 21 rows).

I am fairly new to Python, but this is the code I have been advised to use:

import glob
Results = [open(f) for f in glob.glob("*.data")] 
fout = open("res.csv", 'w')

for row in range(21):
 for f in Results:
  fout.write( f.readline().strip() ) 
  fout.write(',')
 fout.write('\n')
fout.close()

However, there is a slight problem with the code: I only get 125 columns (i.e. the force and displacement values are written into a single column). Please refer to the image below:

[screenshot of the output, with both values merged into a single column]

I'd very much appreciate it if anyone could help me with this!

  • Can you show the output opened with an editor, not Excel? In Excel it is not clear which characters are used as separators. An advanced view with visualized whitespace would also help (as is possible with advanced editors such as Notepad++). Commented Apr 23, 2012 at 13:00
  • Is this a one-time thing? If you are on POSIX, you could just concatenate the files. Commented Apr 23, 2012 at 13:00
  • Hi, I just added an image of the output opened in a text editor. Commented Apr 23, 2012 at 13:14
  • It doesn't look like the code you've posted could have produced the output you've shown in your last update. Where does the comma come from? At the very least, you've got an indentation error at your first fout.write('\n') line. Commented Apr 23, 2012 at 13:18
  • See my solution in your previous question: stackoverflow.com/questions/10273640 Commented Apr 23, 2012 at 14:41

4 Answers

import glob
results = [open(f) for f in glob.glob("*.data")]
sep = ","
# Uncomment if your Excel formats decimal numbers like 3,14 instead of 3.14
# sep = ";"

with open("res.csv", 'w') as fout:
    for row in range(21):
        # read the next line from every file, replace its internal tab with the
        # CSV separator, and join the pieces from all files into one output row
        iterator = (f.readline().strip().replace("\t", sep) for f in results)
        line = sep.join(iterator)
        fout.write("{0}\n".format(line))

To explain what went wrong with your code: your source files use a tab as the field separator, but your code joins the lines it reads from those files with commas. If your Excel uses a period as the decimal separator, it uses a comma as the default field separator; whitespace such as the tab is ignored unless enclosed in quotes, and you see the result.
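
To see the effect on a single line, here is a minimal sketch (the values are made up for illustration):

sep = ","
line = "1.23\t4.56"                      # one tab-separated line as read from a .data file
print(line.strip())                      # 1.23<TAB>4.56 -> Excel sees a single field containing a tab
print(line.strip().replace("\t", sep))   # 1.23,4.56     -> Excel sees two separate fields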

If you use the text import feature of Excel (Data ribbon => From Text), you can tell it to treat both comma and tab as valid field separators, and then I'm pretty sure your original output would work too.

In contrast, the above code should produce a file that opens correctly when double-clicked.


6 Comments

  • Hi, I tried your code, but it wrote everything in a row as opposed to a column.
  • I see. I was wondering if you could kindly edit your code above, as I am slightly confused. Many thanks in advance.
  • Lazyr, thanks a lot for this, but the same problem still exists.
  • Apologies, I forgot to mention it: the data is tab-separated.
  • Fantastic, I imported the original output as you instructed and got it all in order. Thanks a billion!

You don't need to write your own program to do this, in Python or otherwise. You can use an existing Unix command (if you are in that environment):

paste *.data > res.csv

2 Comments

  • ...which means he's on Windows. OK. Unless it's already installed, just download Cygwin and stop mucking around.
  • Cygwin can also be used. Possibly the -d delimiter option should be used to join the lines correctly. On the other hand, the Python code may be more suitable if it is embedded in another Python program or if it needs to be modified later.

Try this:

import glob, csv
from itertools import cycle, islice

def roundrobin(*iterables):
    "roundrobin('ABC', 'D', 'EF') --> A D E B F C"
    # Recipe credited to George Sakkis
    pending = len(iterables)
    nexts = cycle(iter(it).next for it in iterables)
    while pending:
        try:
            for next in nexts:
                yield next()
        except StopIteration:
            pending -= 1
            nexts = cycle(islice(nexts, pending))

results = [open(f).readlines() for f in glob.glob("*.data")]
outfile = open("res.csv", 'wb')
fout = csv.writer(outfile, dialect="excel")

row = []
# roundrobin yields line 1 of every file, then line 2 of every file, and so on,
# so every len(results) lines collected make up one complete output row.
for n, line in enumerate(roundrobin(*results), 1):
    row.extend(line.split())
    if n % len(results) == 0:
        fout.writerow(row)
        row = []
outfile.close()

It takes the corresponding line from each input file in turn and stitches them together into one row, which the csv module then writes out in the Excel dialect.
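
As a quick illustration of the interleaving, here is a small sketch using the roundrobin recipe above, with two made-up two-row "files" standing in for the real ones:

lines_a = ["a1\ta2", "b1\tb2"]   # pretend contents of file 1 (rows 1 and 2)
lines_b = ["c1\tc2", "d1\td2"]   # pretend contents of file 2 (rows 1 and 2)
print(list(roundrobin(lines_a, lines_b)))
# ['a1\ta2', 'c1\tc2', 'b1\tb2', 'd1\td2'] -- row 1 of each file, then row 2 of each file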



I suggest getting used to the csv module. The reason is that if the data is not that simple (plain strings in the headings and then numbers only), it is difficult to implement all of the parsing yourself. Try the following:

import csv
import glob
import os

datapath = './data'
resultpath = './result'
if not os.path.isdir(resultpath):
    os.makedirs(resultpath)

# Initialize the empty list of rows. The number of rows in the
# files is not assumed in advance.
rows = []

# Read data from the files to the above matrix.
for fname in glob.glob(os.path.join(datapath, '*.data')):
    with open(fname, 'rb') as f:
        # the .data files are tab-separated, so tell the reader to split on tabs
        reader = csv.reader(f, delimiter='\t')
        for n, row in enumerate(reader):
            if len(rows) < n+1:
                rows.append([])  # add another row
            rows[n].extend(row)  # append the elements from the file

# Write the data from memory to the result file.
fname = os.path.join(resultpath, 'result.csv')
with open(fname, 'wb') as f:
    writer = csv.writer(f)
    for row in rows:
        writer.writerow(row)

The with construct for opening a file can be replaced by the pair:

f = open(fname, 'wb')
...
f.close()

The csv.reader and csv.writer are simply wrappers that parse or compose the lines of the file. The documentation says they require the file to be opened in binary mode (in Python 2, which this example uses).
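
As a minimal sketch of the parsing step (the values are made up; csv.reader accepts any iterable of lines, so a plain list stands in for a file here):

import csv
row = next(csv.reader(["1.23\t4.56"], delimiter='\t'))  # parse one tab-separated line
print(row)   # ['1.23', '4.56'] -- two string fields, ready for writer.writerow()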

