
I have 125 data files, each containing two columns and 21 rows of data. Please see the image below:

[screenshot: a sample .data file with force and displacement columns]

and I'd like to combine them into a single .csv file (250 columns and 21 rows).

I am fairly new to Python, but this is the code I have been advised to use:

import glob
Results = [open(f) for f in glob.glob("*.data")] 
fout = open("res.csv", 'w')

for row in range(21):
 for f in Results:
  fout.write( f.readline().strip() ) 
  fout.write(',')
 fout.write('\n')
fout.close()

However, there is a slight problem with the code: I only get 125 columns (i.e. the force and displacement values are written into a single column). Please refer to the image below:

[screenshot of the output, with both values merged into a single column]

I'd very much appreciate it if anyone could help me with this!

  • Can you show the output opened with an editor, not Excel? In Excel it is not clear which characters are used as separators. An advanced view with visualized whitespace would also help (as is possible with advanced editors such as Notepad++). Commented Apr 23, 2012 at 13:00
  • Is this a one-time thing? If you are on POSIX, you could just concatenate the files. Commented Apr 23, 2012 at 13:00
  • Hi, I just added an image of the output opened in a text editor. Commented Apr 23, 2012 at 13:14
  • It doesn't look like the code you've posted could have produced the output you've shown in your last update. Where does the comma come from? At the very least, you've got an indentation error at your first fout.write('\n') line. Commented Apr 23, 2012 at 13:18
  • See my solution in your previous question: stackoverflow.com/questions/10273640 Commented Apr 23, 2012 at 14:41

4 Answers

import glob
results = [open(f) for f in glob.glob("*.data")]
sep = ","
# Uncomment if your Excel formats decimal numbers like 3,14 instead of 3.14
# sep = ";"

with open("res.csv", 'w') as fout:
    for row in range(21):
        # read the next line from every file, replace its internal tab with the
        # CSV separator, and join the pieces from all files into one output row
        iterator = (f.readline().strip().replace("\t", sep) for f in results)
        line = sep.join(iterator)
        fout.write("{0}\n".format(line))

To explain what went wrong with your code: your source files use a tab as the field separator, but your code joins the lines it reads from those files with commas. If your Excel uses a period as the decimal separator, it uses a comma as the default field separator; whitespace such as the tab is ignored unless enclosed in quotes, and you see the result.
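
To see the effect on a single line, here is a minimal sketch (the values are made up for illustration):

sep = ","
line = "1.23\t4.56"                      # one tab-separated line as read from a .data file
print(line.strip())                      # 1.23<TAB>4.56 -> Excel sees a single field containing a tab
print(line.strip().replace("\t", sep))   # 1.23,4.56     -> Excel sees two separate fields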

If you use the text import feature of Excel (Data ribbon => From Text), you can tell it to treat both comma and tab as valid field separators, and then I'm pretty sure your original output would work too.

In contrast, the above code should produce a file that opens correctly when double-clicked.


6 Comments

  • Hi, I tried your code, but it wrote everything in a row as opposed to a column.
  • I see. I was wondering if you could kindly edit your code above, as I am slightly confused. Many thanks in advance.
  • Lazyr, thanks a lot for this, but the same problem still exists.
  • Apologies, I forgot to mention it: the data is tab-separated.
  • Fantastic, I imported the original output as you instructed and got it all in order. Thanks a billion!

You don't need to write your own program to do this, in Python or otherwise. You can use an existing Unix command (if you are in that environment):

paste *.data > res.csv

2 Comments

  • ...which means he's on Windows. OK. Unless it's already installed, just download Cygwin and stop mucking around.
  • Cygwin can also be used. Possibly the -d delimiter option should be used to join the lines correctly. On the other hand, the Python code may be more suitable if it is embedded in another Python program or if it needs to be modified later.

Try this:

import glob, csv
from itertools import cycle, islice

def roundrobin(*iterables):
    "roundrobin('ABC', 'D', 'EF') --> A D E B F C"
    # Recipe credited to George Sakkis
    pending = len(iterables)
    nexts = cycle(iter(it).next for it in iterables)
    while pending:
        try:
            for next in nexts:
                yield next()
        except StopIteration:
            pending -= 1
            nexts = cycle(islice(nexts, pending))

results = [open(f).readlines() for f in glob.glob("*.data")]
outfile = open("res.csv", 'wb')
fout = csv.writer(outfile, dialect="excel")

row = []
# roundrobin yields line 1 of every file, then line 2 of every file, and so on,
# so every len(results) lines collected make up one complete output row.
for n, line in enumerate(roundrobin(*results), 1):
    row.extend(line.split())
    if n % len(results) == 0:
        fout.writerow(row)
        row = []
outfile.close()

It takes the corresponding line from each input file in turn and stitches them together into one row, which the csv module then writes out in the Excel dialect.
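
As a quick illustration of the interleaving, here is a small sketch using the roundrobin recipe above, with two made-up two-row "files" standing in for the real ones:

lines_a = ["a1\ta2", "b1\tb2"]   # pretend contents of file 1 (rows 1 and 2)
lines_b = ["c1\tc2", "d1\td2"]   # pretend contents of file 2 (rows 1 and 2)
print(list(roundrobin(lines_a, lines_b)))
# ['a1\ta2', 'c1\tc2', 'b1\tb2', 'd1\td2'] -- row 1 of each file, then row 2 of each file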



I suggest getting used to the csv module. The reason is that if the data is not that simple (plain strings in the headings and then numbers only), it is difficult to implement all of the parsing yourself. Try the following:

import csv
import glob
import os

datapath = './data'
resultpath = './result'
if not os.path.isdir(resultpath):
    os.makedirs(resultpath)

# Initialize the empty list of rows. The number of rows in the
# files is not assumed in advance.
rows = []

# Read data from the files to the above matrix.
for fname in glob.glob(os.path.join(datapath, '*.data')):
    with open(fname, 'rb') as f:
        # the .data files are tab-separated, so tell the reader to split on tabs
        reader = csv.reader(f, delimiter='\t')
        for n, row in enumerate(reader):
            if len(rows) < n+1:
                rows.append([])  # add another row
            rows[n].extend(row)  # append the elements from the file

# Write the data from memory to the result file.
fname = os.path.join(resultpath, 'result.csv')
with open(fname, 'wb') as f:
    writer = csv.writer(f)
    for row in rows:
        writer.writerow(row)

The with construct for opening a file can be replaced by the pair:

f = open(fname, 'wb')
...
f.close()

The csv.reader and csv.writer are simply wrappers that parse or compose the lines of the file. The documentation says they require the file to be opened in binary mode (in Python 2, which this example uses).
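
As a minimal sketch of the parsing step (the values are made up; csv.reader accepts any iterable of lines, so a plain list stands in for a file here):

import csv
row = next(csv.reader(["1.23\t4.56"], delimiter='\t'))  # parse one tab-separated line
print(row)   # ['1.23', '4.56'] -- two string fields, ready for writer.writerow()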

