Time to use a callback function
Your situation is a case where a callback function can help.
Typically, a callback is a function with agreed parameters and sometimes also an agreed return value. The
callback function is passed as an argument to another function, which calls it with the agreed
arguments and leaves the processing to the callback.
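As a minimal sketch of that idea (the names apply_to_each and shout are made up for illustration and are not part of your code), a function can accept a callback and call it with the agreed single argument:

```python
def apply_to_each(items, callback):
    """Call callback(item) for every item and collect the return values."""
    return [callback(item) for item in items]


def shout(text):
    """A callback following the agreed signature: one argument, one return value."""
    return text.upper() + "!"


# The function object itself is passed, without parentheses.
results = apply_to_each(["one", "two"], shout)
print(results)  # ['ONE!', 'TWO!']
```

Note that shout is passed without being called; apply_to_each decides when and with what argument to call it.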
To make your code work, I had to modify it a bit. All the code goes into one file, e.g. one named
"et.py".
To explain it, I will show it piece by piece.
Imports
import os
from glob import glob
Callback for processing content read from the file
Your example read the values into the content variable, overwriting it on each loop iteration, so
in the end you would have only the last value there.
I modified the code by adding a global variable GLOB_CONTENT, to which the content of each file is
appended one by one.
GLOB_CONTENT = []


def read_file_content(path):
    global GLOB_CONTENT
    with open(path) as f:
        # get file content
        content = f.read()
        # do some content processing here
        GLOB_CONTENT.append(content)
Using global variables is sometimes considered suspicious, but it is one way of keeping the global
state of something.
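If you prefer to avoid the global variable, one possible alternative is a closure that keeps the list in a local scope. This is only a sketch: make_content_collector and collect are hypothetical names, and the two temporary files exist just to make the example runnable.

```python
import os
import tempfile


def make_content_collector():
    """Return (callback, contents): the callback appends each file's text to contents."""
    contents = []

    def collect(path):
        with open(path) as f:
            contents.append(f.read())

    return collect, contents


# Demonstration on two throwaway files.
with tempfile.TemporaryDirectory() as tmp_dir:
    paths = []
    for name, text in [("a.txt", "alpha"), ("b.txt", "beta")]:
        path = os.path.join(tmp_dir, name)
        with open(path, "w") as f:
            f.write(text)
        paths.append(path)

    collect, contents = make_content_collector()
    for path in paths:
        collect(path)

print(contents)  # ['alpha', 'beta']
```

The collect function has the same one-argument signature as read_file_content, so it could be passed as a callback in the same way.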
Callback for counting lines - with "memory"
Any function can be used as a callback (as long as it follows the expected signature). One such case is a
method of a class instance. The class below is derived from dict so that it can remember values under a
key name, and it adds a method count_file_lines, which takes the name of a file as its argument:
class FilesLineCounter(dict):
    def count_file_lines(self, path):
        with open(path) as file:
            self[path] = sum(1 for line in file if line.strip())
It counts the non-empty lines in the file and remembers the count in itself, under the file path as key.
Function processing the files
The loop can be generalized into a function:
def process_ids(dir_path, ids_list, file_extension, callback):
    for itm_id in ids_list:
        id_dir = os.path.join(dir_path, itm_id)
        for path in glob(id_dir + '/*' + file_extension):
            callback(path)
As you can see, it gets all the arguments necessary to find the proper files, plus the callback function
used to process each file found.
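To see process_ids at work with the simplest possible callback, here is a small sketch that passes the append method of a list. The temporary directory layout and the file names are invented for the demonstration; process_ids is repeated from above so the snippet runs on its own.

```python
import os
import tempfile
from glob import glob


def process_ids(dir_path, ids_list, file_extension, callback):
    # Same function as above, repeated so the sketch runs on its own.
    for itm_id in ids_list:
        id_dir = os.path.join(dir_path, itm_id)
        for path in glob(id_dir + '/*' + file_extension):
            callback(path)


found = []
with tempfile.TemporaryDirectory() as dir_path:
    # Build subdirectories "1" and "2" with a few throwaway files.
    for itm_id, name in [("1", "one.txt"), ("1", "skip.log"), ("2", "long.txt")]:
        os.makedirs(os.path.join(dir_path, itm_id), exist_ok=True)
        with open(os.path.join(dir_path, itm_id, name), "w") as f:
            f.write("hi\n")

    # list.append takes one argument, so it fits the callback signature.
    process_ids(dir_path, ["1", "2"], ".txt", found.append)

# glob does not guarantee an order, so sort before printing.
names = sorted(os.path.basename(p) for p in found)
print(names)  # ['long.txt', 'one.txt']
```

The file skip.log is ignored because it does not match the requested extension.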
Finally: put it all together
Here is the final part of the code:
if __name__ == "__main__":
    dir_path = "subdir"
    ids_list = ["1", "2"]
    file_extension = ".txt"
    cntr = FilesLineCounter()
    # going to use the callback magic
    process_ids(dir_path, ids_list, file_extension, cntr.count_file_lines)
    process_ids(dir_path, ids_list, file_extension, read_file_content)
    # time to show our results
    for path, numoflines in cntr.items():
        print("File {} has {} lines".format(path, numoflines))
    for i, content in enumerate(GLOB_CONTENT):
        print("File # {} last 3 bytes are {}".format(i, content[-3:]))
The line cntr = FilesLineCounter() creates our special sort of extended dictionary: cntr is an empty
dictionary with an added method count_file_lines. As a method is usable as a function, we pass
cntr.count_file_lines as the value for callback.
After process_ids has run, cntr holds one key per processed file, each with the number of non-empty
lines in that file as its value.
The file contents are read the same way, using read_file_content as the callback.
Running $ python et.py, I get the following output:
File subdir/1/one-plus.txt has 1 lines
File subdir/2/empty.txt has 0 lines
File subdir/1/one.txt has 8 lines
File subdir/2/long.txt has 42 lines
File # 0 last 3 bytes are fa
File # 1 last 3 bytes are hi
File # 2 last 3 bytes are fa
File # 3 last 3 bytes are