I'm working with a large set of CSV data and I want to merge several columns from different positions into a single column, with the values separated by semicolons (;).
What I have now is:
a b c d
1 2 3 4
1 2 3 4
1 2 3 4
I want to change it to this, so that all my data ends up in column d only:
a b c d
a=1;b=2;c=3;d=4;
a=1;b=2;c=3;d=4;
a=1;b=2;c=3;d=4;
I know how to delete the empty columns a, b and c, but I can't figure out how to merge the data from columns a, b and c into column d. Thanks in advance.
The code that I have so far is:
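For illustration, here is a minimal sketch of the merge described above, assuming the header names and the row values are available as two lists of equal length. The function name `merge_row` is hypothetical and not part of the `csv` module.

```python
def merge_row(header, row):
    """Join every column into one string of 'name=value;' pairs."""
    # zip pairs each header name with the value in the same position
    return ''.join('{}={};'.format(name, value) for name, value in zip(header, row))


header = ['a', 'b', 'c', 'd']
rows = [['1', '2', '3', '4'], ['1', '2', '3', '4'], ['1', '2', '3', '4']]

# One merged string per input row, matching the desired layout of column d
merged = [merge_row(header, row) for row in rows]
for line in merged:
    print(line)
```

Each merged string keeps the trailing semicolon shown in the desired output; dropping it would only need `';'.join(...)` instead.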
# Parsing the custom formatted data with the csv module.
# Reads the custom-format input and writes the output in VCF format.
import csv

# input and output
with open('1-0002', 'rb') as csvin, open('converted1', 'wb') as csvout:
    # reading and writing are both tab-delimited
    reader = csv.reader(csvin, delimiter='\t')
    writer = csv.writer(csvout, delimiter='\t')
    # write the headings before the for loop so they are not affected by the column manipulation
    writer.writerow(["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO"])
    for row in reader:
        # deleting unnecessary columns; the 'del' statements must go from the highest
        # index to the lowest, otherwise the remaining indices shift and raise a range error
        # manually deleting columns since the input data is in a custom format
        del row[11]
        del row[10]
        del row[9]
        del row[8]
        del row[7]
        del row[6]
        del row[5]
        del row[1]
        del row[0]
        # inserting 1 and . in specific columns
        row.insert(0, '1')
        row.insert(2, '.')
        row.insert(5, '.')
        row.insert(7, '')  # inserting an empty column for the INFO heading
        # change 'YES' to 'PASS', leaving HETERO as it is
        if row[6] == 'YES':
            row[6] = 'PASS'
        writer.writerow(row)
So, starting from the code above, I want to put the data from several different columns into the INFO column.
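One possible approach, sketched under assumptions: read the values for INFO before any column is deleted, while the original indices are still valid, and build the output row directly. The function name `build_vcf_row`, the keys `a`, `b`, `c` and the indices 5, 6, 7 are placeholders for illustration only; the sketch also assumes each input row has at least 13 columns, with POS, REF, ALT and the YES/HETERO flag at indices 2, 3, 4 and 12, which is what the deletions and insertions above imply.

```python
def build_vcf_row(row, info_fields):
    """Return one output row; info_fields is a list of (key, input_index) pairs."""
    # Collect the INFO values first, using the original (undeleted) indices
    info = ';'.join('{}={}'.format(key, row[idx]) for key, idx in info_fields)
    # Same rule as the posted code: 'YES' becomes 'PASS', anything else is kept
    filter_value = 'PASS' if row[12] == 'YES' else row[12]
    # Order: CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO
    return ['1', row[2], '.', row[3], row[4], '.', filter_value, info]


# Hypothetical 13-column input row, just to show the shape of the result
sample = [str(i) for i in range(12)] + ['YES']
print(build_vcf_row(sample, [('a', 5), ('b', 6), ('c', 7)]))
```

Inside the loop this would replace the `del`/`insert` sequence with `writer.writerow(build_vcf_row(row, info_fields))`, so no index has to be recalculated after a deletion.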
d, or can the new merged column be called something else, e.g. d_merged?
row = row[12:]?