I'm trying to compare two CSV files and print the product_id field from each file together on one line. This is the code in question. Note that the fields are not in the same order in the two CSV files.
import csv
import sys

f1 = sys.argv[1]
f2 = sys.argv[2]
num_matches = 0

with open(f1, 'rb') as f:
    csv_readerf = csv.reader(f)
    csv_readerf.next()  # skip the header row
    with open(f2, 'rb') as n:
        csv_readern = csv.reader(n)
        csv_readern.next()  # skip the header row
        for row in csv_readerf:
            a_name = row[0].replace(" ", "").lower()  # not used, can be omitted
            a_id = row[1]
            a_post = row[2]
            a_rev = row[3]
            a_loc = row[4]  # not used, can be omitted
            a_desc = row[5].replace(" ", "").lower()  # remove all whitespace for uniformity
            a_ovr = row[6]
            a_cmf = row[7]
            a_sty = row[8]
            a_siz = row[9]
            a_arc = row[10]
            a_wid = row[11]
            a_url = row[12]
            for rowP in csv_readern:
                p_name = rowP[10].replace(" ", "").lower()
                p_id = rowP[6]
                temp = rowP[11].split(" ")[0:3]  # disregard time stamp
                p_post = (" ").join(temp)
                p_rev = rowP[7]
                if p_rev == "":
                    p_rev = "Anonymous"
                p_desc = rowP[1].replace(" ", "").replace("\n", "").replace("\r\n", "").lower()
                p_ovr = rowP[4]
                p_cmf = rowP[3]
                p_sty = rowP[0]
                p_siz = rowP[8]
                p_arc = rowP[9]
                if p_arc:
                    p_arc = p_arc[0 : p_arc.index(" ")]  # for arch we only want the first word
                p_wid = rowP[5]
                p_url = rowP[2]
                print a_id, p_id
The problem I am having is that in the output, which I dumped into a .txt file, not all the product_ids from f1 are printed. I know this for sure because f1 is a test file I created, and I purposely placed several products with different IDs in it.
Another thing to note is that I tried looping through each CSV file in a separate script, and each worked correctly, printing every product_id as expected. Why is it that when I nest the for loops, iteration over the first file seems to be cut short? What could be the problem? The test files I made are small, so they should fit in memory without any trouble.
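Did you mean to nest the second for loop inside the first? This could be the source of your problem: a csv.reader is an iterator over the open file, so the first pass of the inner loop reads f2 all the way to the end. On every later iteration of the outer loop the reader is already exhausted, so the body of the inner loop never runs again and nothing more is printed.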
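One way to avoid the exhausted reader is to read the rows into a list first, since a list can be looped over as many times as you like. The following is only a minimal sketch, not a drop-in replacement for your script: it is written for Python 3 (your code is Python 2), the helper names `read_rows` and `pair_ids` are made up for illustration, and the in-memory sample data is hypothetical. Only the column positions (index 1 in the first file, index 6 in the second) are taken from your code.

```python
import csv
import io


def read_rows(handle):
    """Return all data rows of an open CSV file as a list, skipping the header."""
    reader = csv.reader(handle)
    next(reader, None)  # skip the header row; no error if the file is empty
    return list(reader)


def pair_ids(handle_a, handle_b, a_col=1, b_col=6):
    """Pair every product_id in the first file with every product_id in the second.

    a_col and b_col are the column positions used in the question's code.
    """
    rows_a = read_rows(handle_a)
    rows_b = read_rows(handle_b)  # a list, so the inner loop can restart each time
    return [(a[a_col], b[b_col]) for a in rows_a for b in rows_b]


# Demonstration of the underlying issue: a reader can only be consumed once.
reader = csv.reader(io.StringIO("x\n1\n2\n"))
first_pass = list(reader)  # consumes every row
leftover = list(reader)    # the reader is now exhausted, so this is empty

# Hypothetical sample data standing in for the two files.
a_data = "name,product_id\nshoe,A1\nboot,A2\n"
b_data = (
    "c0,c1,c2,c3,c4,c5,product_id\n"
    "x,x,x,x,x,x,P1\n"
    "x,x,x,x,x,x,P2\n"
)

pairs = pair_ids(io.StringIO(a_data), io.StringIO(b_data))
for a_id, p_id in pairs:
    print(a_id, p_id)
```

With real files you would pass the handles from `open(f1, newline='')` and `open(f2, newline='')` instead of the `io.StringIO` objects. If the second file were too large to hold in memory, an alternative would be to call `n.seek(0)` and skip the header again before each run of the inner loop, at the cost of re-reading the file every time.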