0

I have thousands of files inside a directory, named with the pattern YYYYMMDDHHMM:

  • 201801010000.txt
  • 201801010001.txt
  • 201801010002.txt

I want to keep just the hours, so I need to merge the 60 files for each hour of each day into one. I don't know how to search the filenames to get the 60 files that I want. This is what I wrote:

def concat_files(path):
    file_list = os.listdir(path)
    with open(datetime.datetime.now(), "w") as outfile:
        for filename in sorted(file_list):
            with open(filename, "r") as infile:
                outfile.write(infile.read())

How do I name the output file so that it keeps the date? I'm using datetime.now(), but that replaces the date from the original filename. With my code I'm merging all the files into one; I should merge every group of 60 into a different file.

3
  • Possible duplicate of Merge CSV Files in Python with Different file names Commented May 16, 2018 at 12:10
  • If the filename is already in the format YYYYMMDDHHMM, can't you just remove the last two characters before the .txt extension? Commented May 16, 2018 at 12:11
  • IMO a combination of groupby and datetime.strptime will solve this easily. Can you elaborate on input and output? Commented May 16, 2018 at 12:11

3 Answers

1

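You were not that far off; you just need to swap your logic. Loop over the input files first, and for each one open the matching hourly output file in append mode:

import os

file_list = os.listdir(path)
for filename in sorted(file_list):
    # '201801010000.txt' -> '2018010100.txt' (drop the minutes)
    out_filename = filename[:-6] + '.txt'
    # Append mode, so all 60 minute files add to the same hourly file
    with open(out_filename, 'a') as outfile:
        with open(os.path.join(path, filename), 'r') as infile:
            outfile.write(infile.read())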
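
As an alternative sketch of the same idea, the sorted names can be grouped by their YYYYMMDDHH prefix with itertools.groupby, so that each hourly file is opened once in write mode and re-running the script does not append duplicates. The names merge_by_hour, hour_key, src_dir and dst_dir are made up for illustration, and the code assumes every .txt file in the source directory is named YYYYMMDDHHMM.txt.

```python
import os
from itertools import groupby


def hour_key(filename):
    # "201801010000.txt" -> "2018010100" (YYYYMMDDHH)
    return filename[:10]


def merge_by_hour(src_dir, dst_dir):
    """Merge the minute files in src_dir into one file per hour in dst_dir."""
    os.makedirs(dst_dir, exist_ok=True)
    # Assumption: every .txt file in src_dir is named YYYYMMDDHHMM.txt.
    # Sorting first is required, because groupby only groups adjacent items.
    names = sorted(n for n in os.listdir(src_dir) if n.endswith(".txt"))
    written = []
    for hour, group in groupby(names, key=hour_key):
        out_path = os.path.join(dst_dir, hour + ".txt")
        with open(out_path, "w") as outfile:
            for name in group:
                with open(os.path.join(src_dir, name), "r") as infile:
                    outfile.write(infile.read())
        written.append(out_path)
    return written
```

Writing to a separate dst_dir keeps the merged files out of the directory being listed, so they are never picked up as input.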


1

You can use glob to get just the files you want. It lets you pass in a pattern to match against when searching for files. In the last line below, it will only find files that begin with '2018010100', are followed by any two characters, and end with '.txt'. The output path is passed in explicitly, because open() needs a filename rather than a datetime object.

import os
from glob import glob

def concat_files(dir_path, file_pattern, out_path):
    # Collect only the files whose names match the pattern
    file_list = glob(os.path.join(dir_path, file_pattern))
    with open(out_path, "w") as outfile:
        for filename in sorted(file_list):
            with open(filename, "r") as infile:
                outfile.write(infile.read())

concat_files('C:/path/to/directory', '2018010100??.txt', '2018010100.txt')
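
To cover every hour rather than a single hard-coded one, the patterns themselves can be derived from the directory contents. The following is only a sketch: hourly_patterns is a hypothetical helper, and it assumes the minute files are named with exactly 12 digits followed by '.txt'.

```python
import glob
import os


def hourly_patterns(dir_path):
    """Return one glob pattern per hour found in dir_path, e.g. '2018010100??.txt'."""
    # Assumption: minute files are named YYYYMMDDHHMM.txt (12 digits + ".txt").
    minute_files = glob.glob(os.path.join(dir_path, "[0-9]" * 12 + ".txt"))
    # The first 10 characters of the basename identify the hour (YYYYMMDDHH).
    hours = sorted({os.path.basename(f)[:10] for f in minute_files})
    return [hour + "??.txt" for hour in hours]
```

Each returned pattern can then be passed as the file_pattern argument, one call per hour.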


0

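Try this one. It collects the distinct hour prefixes from the filenames, then appends every file that starts with a given prefix to that hour's output file.

import os

file_list = os.listdir(path)
# Distinct hour prefixes, e.g. '2018010100' from '201801010000.txt'
for prefix in {name[:-6] for name in file_list}:
    if not prefix:
        continue  # skip names too short to carry a timestamp
    with open(prefix + '.txt', 'a') as outfile:
        for name in sorted(s for s in file_list if s.startswith(prefix)):
            with open(os.path.join(path, name), 'r') as infile:
                outfile.write(infile.read())
            #os.remove(os.path.join(path, name)) # optional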
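
The check that skips empty prefixes only catches very short names. A stricter filter, sketched below, parses the name with datetime.strptime and ignores anything that is not a valid YYYYMMDDHHMM.txt timestamp, including hourly output files left over from an earlier run. hour_prefix is a hypothetical helper, not part of the answer above.

```python
from datetime import datetime


def hour_prefix(filename):
    """Return 'YYYYMMDDHH' for a name like 'YYYYMMDDHHMM.txt', otherwise None."""
    stem, _, ext = filename.rpartition(".")
    # Assumption: valid inputs have a 12-character stem and a .txt extension.
    if ext != "txt" or len(stem) != 12:
        return None
    try:
        stamp = datetime.strptime(stem, "%Y%m%d%H%M")
    except ValueError:
        return None  # not a real date/time
    return stamp.strftime("%Y%m%d%H")
```

Names for which hour_prefix returns None can simply be left out of the list before grouping.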

1 Comment

Welcome to Stack Overflow! While it's great to answer questions and we welcome it, it is also necessary to explain how your code solves the problem. Add the relevant explanation to your answer. From Review
