I have multiple files and I want to read them simultaneously, extract a number from each row, and average those numbers across the files. For a small number of files I did this using izip from the itertools module. Here is my code:

from itertools import izip
import math

g=open("MSDpara_ave_nvt.dat",'w')

with open("sample1/err_msdCECfortran_nvt.dat",'r') as f1, \
     open("sample2/err_msdCECfortran_nvt.dat",'r') as f2, \
     open("sample3/err_msdCECfortran_nvt.dat",'r') as f3, \
     open("err_msdCECfortran_nvt.dat",'r') as f4:

     for x,y,z,bg in izip(f1,f2,f3,f4):
         args1=x.split()
         i1 = float(args1[0])
         msd1 = float(args1[1])


         args2=y.split()
         i2 = float(args2[0])
         msd2 = float(args2[1])


         args3=z.split()
         i3 = float(args3[0])
         msd3 = float(args3[1])

         args4=bg.split()
         i4 = float(args4[0])
         msd4 = float(args4[1])


         msdave = (msd1 + msd2 + msd3 + msd4)/4.0

         print>>g, "%e  %e" %(i1, msdave)

f1.close()
f2.close()
f3.close()
f4.close()
g.close()

This code works OK. But if I want to handle 100 files simultaneously, the code becomes very lengthy this way. Is there a simpler way of doing this? It seems that the fileinput module can also handle multiple files, but I don't know if it can read them simultaneously.

Thanks.

You don't need to explicitly close files opened in a with statement. Commented Jun 8, 2014 at 17:32

1 Answer

The with open pattern is good, but in this case it gets in your way. You can open a list of files, then use that list inside izip:

filenames = ["sample1/err_msdCECfortran_nvt.dat",...]
files = [open(i, "r") for i in filenames]
for rows in izip(*files):
    # rows is now a tuple containing one row from each file

In Python 3.3+ you can also use ExitStack (from contextlib) in a with block:

from contextlib import ExitStack

filenames = ["sample1/err_msdCECfortran_nvt.dat",...]
with ExitStack() as stack:
    # every file entered on the stack is closed when the with block exits
    files = [stack.enter_context(open(i, "r")) for i in filenames]
    for rows in zip(*files):
        # rows is now a tuple containing one row from each file

In Python < 3.3, to use a with statement with all its advantages (e.g. timely closing no matter how you exit the block), you would need to create your own context manager:

class FileListReader(object):

    def __init__(self, filenames):
        # open every file up front; __exit__ closes them again
        self.files = [open(i, "r") for i in filenames]

    def __enter__(self):
        for i in self.files:
            i.__enter__()
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        for i in self.files:
            i.__exit__(exc_type, exc_value, traceback)

Then you could do:

filenames = ["sample1/err_msdCECfortran_nvt.dat",...]
with FileListReader(filenames) as f:
    for rows in izip(*f.files):
        #...

In this case the last might be considered over-engineering, though.
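For completeness, here is a minimal sketch that combines the ExitStack variant with the averaging from the question. It is written for Python 3 and assumes every row has at least two whitespace-separated numeric columns, as in the question's data. The function name average_columns and its parameters are made up for this illustration.

```python
from contextlib import ExitStack


def average_columns(filenames, out_path):
    """For each row, write the first column of the first file and the mean
    of the second column taken across all input files."""
    with ExitStack() as stack:
        # Every file entered on the stack is closed when the block exits.
        files = [stack.enter_context(open(name, "r")) for name in filenames]
        out = stack.enter_context(open(out_path, "w"))
        # zip(*files) yields one tuple per row, holding that row from each
        # file; it stops at the end of the shortest file.
        for rows in zip(*files):
            fields = [row.split() for row in rows]
            index = float(fields[0][0])
            mean_msd = sum(float(f[1]) for f in fields) / len(fields)
            out.write("%e  %e\n" % (index, mean_msd))
```

With a list of 100 paths the call stays a single line, e.g. average_columns(filenames, "MSDpara_ave_nvt.dat"), where filenames is whatever list you build for your own directory layout. Keep in mind that the operating system limits how many files a process can have open at once, so a much larger number of files may need a different approach.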

5 Comments

Instead of creating a new one, the OP could upgrade to modern Python and use an ExitStack instead.
@DSM, thanks for the link. I didn't know about that one (I use 2.7). That's certainly less code when used only once. I'll integrate it to the answer.
Thanks a lot, @otus. That's very helpful. So if I do 'files = [open(i, "r") for i in filenames] for rows in izip(files):' as you said, how can I read lines from each tuple "rows"? Apparently I cannot use readline().
@otus, it seems that the tuple 'rows' is not a tuple of strings. If I print the content of 'rows', I only get something like "<open file 'err_msdCECfortran_new.dat', mode 'r' at 0x7ff7182d16f0>". And if I check its length using 'len(rows)', it is one. I'm a bit confused about why 'rows' does not contain a row of strings from my data files as you mentioned.
@user2226358, sorry, I forgot the star * inside zip. Answer updated. (It passes the list as multiple arguments instead of one, so that zip will indeed zip them.)
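The difference the star makes can be seen in a small sketch. It is written for Python 3 and uses io.StringIO objects as stand-ins for open files (an assumption made so the example needs no files on disk); the variable names are illustrative only.

```python
import io

# Stand-ins for open files: StringIO objects iterate line by line, like files.
files = [io.StringIO("a1\na2\n"), io.StringIO("b1\nb2\n")]

# Without the star, zip receives ONE iterable (the list itself), so each
# tuple holds a single file object rather than a row of text.
without_star = list(zip(files))

# With the star, each file is passed as a separate argument, so each tuple
# holds one line from every file.
with_star = list(zip(*files))
```

Here every tuple in without_star has length one and contains a file object, which matches what the comment above describes, while every tuple in with_star contains one line per file.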
