I have a working script, but it iterates over k-combinations, so it takes a long time to run. I want to parallelize a for loop to cut the run time.

Here is the simplified code:

import linecache

# intersection() and merge are defined elsewhere in the full script
fin2 = open('combi_nod.txt', 'r')
for lines in fin2:
    (i, j) = eval(lines)
    edgefile = open('edge.adjlist', 'a')
    count = 0
    for element in intersection(
            eval(linecache.getline('triangleset.txt', i+1)),
            eval(linecache.getline('triangleset.txt', j+1))):
        if element not in merge:
            count = 1
            break
    if count == 0:
        edgefile.write(' ' + str(j))
    edgefile.close()
fin2.close()

How can I do this?

EDIT

After some modifications I got the multiprocessing loop working, but there is a related issue:

In my initial for loop I read from the file combi_nod.txt, which contains the itertools.combinations of a large number of items (so at some point I can no longer store them in a variable).

My multiprocessing loop works on a list built from this itertools.combinations call, because I haven't found a way to pass the lines of a file as arguments, so I run out of memory. Do you have another idea?
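To illustrate the memory point: `itertools.combinations` is lazy, so the pairs only pile up in memory once they are collected into a list. The sketch below is a minimal illustration in Python 3 (hence `range` rather than `xrange`); the helper name `batches` is made up for this example and is not part of my script. It cuts any iterable into fixed-size lists, so that only one batch exists at a time, and each batch could then be handed to the pool in turn.

```python
import itertools


def batches(iterable, size):
    """Yield successive lists of at most `size` items from any iterable."""
    iterator = iter(iterable)
    while True:
        # islice pulls the next `size` items without touching the rest
        batch = list(itertools.islice(iterator, size))
        if not batch:
            return
        yield batch


# combinations() is lazy: no pair is stored until a batch asks for it
pairs = itertools.combinations(range(5), 2)
for batch in batches(pairs, 4):
    print(len(batch), batch[0], batch[-1])
```

The batch size is a trade-off I have not measured: larger batches mean fewer round trips to the workers but more pairs held in memory at once.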

EDIT2

For clarification, here is the code as it stands at this point:

def intersterer(lines):
    (i, j) = lines
    counttt = 0
    for element in some_stuff:
        if element not in merge:
            counttt = 1
            break
    if counttt == 0:
        return (int(i), int(j))
    else:
        return (0, 0)

import itertools
from multiprocessing import Pool

fin2 = open('combi_nod.txt', 'w')
# counter_tri is a large number
for trian_c in itertools.combinations(xrange(0, counter_tri), 2):
    fin2.write(str(trian_c) + "\n")
fin2.close()
fin2 = open('combi_nod.txt', 'r')

if __name__ == '__main__':
    pool = Pool()
    listt = pool.map(intersterer, itertools.combinations(xrange(0, counter_tri), 2))
    f2(listt)
    # drop every (0, 0) placeholder, not just the first one
    listt = [pair for pair in listt if pair != (0, 0)]

What I would like is something that works like this:

listt = pool.map(intersterer, fin2) 

But none of my attempts work at all. Any help would be appreciated.
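For reference, here is a minimal sketch of the behaviour I am after, written in Python 3 and assuming that `Pool.imap` is an acceptable substitute for `Pool.map`. `Pool.map` turns an iterable without a length into a list before dispatching, whereas `imap` pulls items from the iterable as it goes, so an open file can be passed directly. The names `parse_pair`, `check_line` and `filter_pairs` are hypothetical, and the even-sum test is only a placeholder for the real comparison against `merge`.

```python
import ast
import itertools
import os
import tempfile
from multiprocessing import Pool


def parse_pair(line):
    """Turn a text line such as "(3, 7)" into a tuple of ints without eval()."""
    i, j = ast.literal_eval(line.strip())
    return int(i), int(j)


def check_line(line):
    """Worker: parse one line and return the pair if accepted, else None.

    Placeholder rule (assumption): keep pairs whose sum is even. The real
    script would compare the two triangle sets against `merge` here.
    """
    i, j = parse_pair(line)
    return (i, j) if (i + j) % 2 == 0 else None


def filter_pairs(path, processes=2, chunksize=1000):
    """Stream the lines of `path` through a pool and collect accepted pairs."""
    with open(path) as lines, Pool(processes) as pool:
        # imap reads lines as it dispatches them and preserves input order
        results = pool.imap(check_line, lines, chunksize)
        return [pair for pair in results if pair is not None]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "combi_nod.txt")
        with open(path, "w") as out:
            for pair in itertools.combinations(range(5), 2):
                out.write(str(pair) + "\n")
        print(filter_pairs(path))
```

Returning `None` for rejected pairs and filtering in the parent avoids the `(0, 0)` placeholder, which would otherwise be indistinguishable from a genuine pair of zeros. The result list still grows with the number of accepted pairs, so for very large outputs the parent could write each pair to a file instead of collecting them.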

  • It's hard to tell from the simplified code, but where is the process spending most of its time, computing the eval()s or reading/writing the files? Commented Jan 14, 2015 at 20:58
  • All my variables are written to files to minimize RAM usage. My script generates a lot of data, so I can't keep all of it in memory. Commented Jan 14, 2015 at 21:23
  • The only part I want to parallelize is the first for loop. I'm thinking about processing eight lines at once. Commented Jan 14, 2015 at 21:27
  • That statement's a bit confusing since the second for loop is nested inside the first. Writing everything to files may complicate the matter if you need simultaneous write access to the same one from concurrent processes or threads. In concurrent programming access to any shared resource has to be controlled by locks or semaphores or something similar. Commented Jan 14, 2015 at 23:09
  • I can split the initial file and merge the pieces in a final step. Thanks for pointing out this upcoming issue. As I said, it's the first loop that I want to parallelize. Commented Jan 15, 2015 at 7:03
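
The split-and-merge idea from the comments can be sketched roughly as follows, in Python 3. Each worker writes only to its own part file, so no lock is needed, and the parent concatenates the parts once all workers are done. The function names, the part file names and the even-sum test are assumptions made for this illustration, not code from the actual script.

```python
import os
import shutil
import tempfile
from multiprocessing import Pool


def write_part(job):
    """Worker: write the accepted pairs of one chunk to its own part file."""
    part_path, pairs = job
    with open(part_path, "w") as out:
        for i, j in pairs:
            if (i + j) % 2 == 0:  # placeholder for the real test
                out.write("%d %d\n" % (i, j))
    return part_path


def merge_parts(part_paths, final_path):
    """Concatenate the part files, in the given order, into one output file."""
    with open(final_path, "w") as final:
        for part_path in part_paths:
            with open(part_path) as part:
                shutil.copyfileobj(part, final)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        jobs = [
            (os.path.join(tmp, "part0.txt"), [(0, 1), (0, 2)]),
            (os.path.join(tmp, "part1.txt"), [(1, 3), (2, 3)]),
        ]
        with Pool(2) as pool:
            parts = pool.map(write_part, jobs)  # one file per worker task
        final_path = os.path.join(tmp, "edge.txt")
        merge_parts(parts, final_path)
        with open(final_path) as merged:
            print(merged.read())
```

Because `pool.map` returns the part paths in job order, the merged file keeps the same order as a sequential run would produce.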
