2

I have a text file in which each row has multiple words (which I want to treat as columns). I want to read all the data from this text file and create a CSV file with rows and columns. This is the code I have written so far:

import csv
f=open("text.txt", "r")
reader=csv.reader(f)
offile=open("output.csv","wb")
writer=csv.writer(offile,delimiter='\t',quotechar='"',quoting=csv.QUOTE_ALL)
for row in reader:
 ........

f.close()
offile.close()

I don't understand how to divide each row into columns and then write those rows and columns to a CSV file. I am a newbie to Python, so I would be very grateful for a good example.

Thanks

1
  • Please post a link to test.txt if you want someone to be able to give you more than passing help. Commented Apr 1, 2014 at 19:09

3 Answers

1

Try splitting the lines via a regular expression:

import re

line = "Foo bar baz quux"
# Split on runs of one or more whitespace characters.
pieces = re.split(r"\s+", line)
print(pieces)

This results in

['Foo', 'bar', 'baz', 'quux']

The regular expression used above matches one or more (+) whitespace characters (\s).
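
To connect the split back to the original question, here is a minimal sketch of splitting every line and writing the pieces out as CSV rows. It assumes Python 3 and fields separated only by whitespace; the helper names `split_line` and `rows_to_csv` and the sample data are made up for illustration, and it uses the default comma delimiter rather than the tab delimiter from the question.

```python
import csv
import io
import re


def split_line(line):
    """Split one line of text into fields on runs of whitespace."""
    # strip() first: re.split would otherwise yield empty strings for
    # leading or trailing whitespace.
    return re.split(r"\s+", line.strip())


def rows_to_csv(lines, out):
    """Write each non-empty line as one CSV row to the file-like object out."""
    writer = csv.writer(out, quoting=csv.QUOTE_ALL)
    for line in lines:
        if line.strip():
            writer.writerow(split_line(line))


# Hypothetical sample input standing in for the contents of the text file.
sample = ["data1 data2   data3", "  data4\tdata5 data6  "]
buffer = io.StringIO()
rows_to_csv(sample, buffer)
print(buffer.getvalue())
```

With real files, the same function could be called as `rows_to_csv(f, out)` where `f` is the opened input file and `out` is the output file opened with `open("output.csv", "w", newline="")`.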


9 Comments

Do you know that split() without argument is equivalent to split("\s+") but faster?
why would you do this instead of line.split() ??
The question author didn't specify much detail on how they wanted to split, other than "divide rows into columns". Yes, there are several ways to split on whitespace. I used the re module specifically to draw attention to regular expressions, which give you the flexibility to split in any number of different ways. The beginner nature of the question led me to believe that we couldn't take it for granted that he/she knows what regular expressions are.
The input text file has different types of data in each row, like: data1 data2 data3 data4 data5 data6 data7. All I want is to treat each 'data' item as a column in a single row and write it as a single row into a CSV file. Hope this helps more.
@user3486471 are you saying it has strange delimiters that separate fields? Do you know what they are? Is this columnar data where say the first 10 columns is one field? For follow up questions, you might want to pastebin a sample of the data you're trying to process.
0
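import csv
import re
# Read the input; on Python 2, open the output with mode "wb" instead.
with open("test.txt") as f:
    data = f.read()
lines_of_data = data.splitlines()
with open("output.csv", "w", newline="") as offile:
    writer = csv.writer(offile, delimiter='\t', quotechar='"', quoting=csv.QUOTE_ALL)
    # Split each stripped line on runs of four or more whitespace characters.
    writer.writerows(re.split(r"\s\s\s\s+", line.strip()) for line in lines_of_data)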
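
The pattern above only splits on runs of four or more whitespace characters, which matters when a field itself contains single spaces. A small sketch of the difference, using a made-up sample line (the real input file was never shown, so the spacing here is an assumption):

```python
import re

# Hypothetical line: fields are separated by four or more spaces,
# and two of the fields contain a single space of their own.
line = "New York    42    blue sky"

# Split only on wide gaps (four or more whitespace characters).
wide_gaps = re.split(r"\s{4,}", line.strip())

# Split on any run of whitespace, which also breaks up multi-word fields.
any_gap = line.split()

print(wide_gaps)
print(any_gap)
```

If every field is a single word, `line.split()` is the simpler choice; the wide-gap pattern is only useful when the column separator is reliably wider than the spaces inside a field.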

9 Comments

sorry for my ignorance - but how do I get this 'data'. All I have is an input text file 'f' and an output csv file 'offile'?
data is an example of f.read(). Since you never showed us what your input file looks like, I assumed its contents looked something like my data variable.
please post a whole textfile or at least several rows
Yeah, it would be better if you could provide a link (to download)... I think pastebin is normalizing your whitespace to normal spaces (I assume the delimiter is tabs).
The link shows my rows correctly. That is the concern here: the delimiter is not tabs; each field is separated from the preceding field by normal spaces.
0
import pandas as pd

# Read the file and split each line on whitespace into a list of fields.
with open('test.txt') as f:
    tmp = [line.split() for line in f.read().splitlines()]

# Build a DataFrame (one row per line) and write it out as CSV.
data_df = pd.DataFrame(tmp)
data_df.to_csv('test.csv')

1 Comment

Welcome to Stack Overflow! Code-only answers are discouraged. Please click on edit and add a paragraph or two summarizing how your code addresses the question, or perhaps explain how your answer differs from the previous answer/answers. Thanks.
