Python script to read a text file and write into a csv file

Question

I have a text file in which each row has multiple words (which I want to treat as columns). I want to read all the data from this text file and create a csv file with rows and columns. I have written the code up to here:

import csv
f=open("text.txt", "r")
reader=csv.reader(f)
offile=open("output.csv","wb")
writer=csv.writer(offile,delimiter='\t',quotechar='"',quoting=csv.QUOTE_ALL)
for row in reader:
 ........

f.close()
offile.close()

I do not understand how to divide each row into columns and then write these rows and columns to a csv file. I am new to Python, so I would be very grateful for a good example.

Thanks

Please post a link to test.txt if you want someone to be able to give you more than passing help.
– Joran Beasley, Commented Apr 1, 2014 at 19:09

FrobberOfBits · Accepted Answer · 2014-04-01 19:01:36Z

1

Try splitting the lines via a regular expression:

import re

line = "Foo bar baz quux"
# Split on runs of whitespace; the raw string keeps \s from being treated as a string escape.
pieces = re.split(r"\s+", line)
print(pieces)

This results in

['Foo', 'bar', 'baz', 'quux']

The regular expression used above matches one or more (+) whitespace characters (\s).
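Putting this split into the loop from the question, a minimal sketch might look like the following. It assumes Python 3 and that columns are separated by any run of whitespace. The function name `text_to_csv` is hypothetical, and the writer uses the csv module's default comma delimiter rather than the tab delimiter and quoting options from the question.

```python
import csv
import re


def text_to_csv(in_path, out_path):
    """Split each line of in_path on whitespace and write it as one CSV row."""
    # newline="" lets the csv module control line endings in the output file.
    with open(in_path, "r") as infile, open(out_path, "w", newline="") as outfile:
        writer = csv.writer(outfile)
        for line in infile:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            # Each whitespace-separated word becomes one column.
            writer.writerow(re.split(r"\s+", line))
```

With the file names from the question, this would be called as `text_to_csv("text.txt", "output.csv")`. The `delimiter`, `quotechar` and `quoting` arguments from the question can be passed to `csv.writer` if tab-separated, fully quoted output is wanted.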

answered Apr 1, 2014 at 19:01

FrobberOfBits

9 Comments

user297171 Over a year ago

Do you know that split() without argument is equivalent to split("\s+") but faster?

Joran Beasley Over a year ago

why would you do this instead of line.split() ??

FrobberOfBits Over a year ago

The question author didn't specify much detail on how they wanted to split, other than "divide rows into columns". Yeah, there are a bunch of different ways you can split by whitespace. I did it with the re module specifically to draw attention to the use of a regular expression, which gives you the flexibility to split in any number of different ways. The beginner nature of the question led me to believe that we couldn't take it for granted that he/she knows what regular expressions are.
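To illustrate the trade-off discussed in these comments: on a line with no leading or trailing whitespace the two approaches return the same list, but they differ at the edges of the line, and a pattern can describe separators other than whitespace. A small sketch, with sample strings made up for illustration:

```python
import re

line = "  Foo bar   baz\tquux \n"

# str.split() with no argument splits on runs of whitespace and
# ignores leading and trailing whitespace.
plain = line.split()

# re.split() returns empty strings where the line starts or ends with a match.
regex = re.split(r"\s+", line)

# Stripping the line first makes the two results agree.
regex_stripped = re.split(r"\s+", line.strip())

# A pattern can also describe other separators, for example commas or
# semicolons with optional surrounding spaces (a made-up format).
mixed = re.split(r"\s*[,;]\s*", "a, b;c ;  d")

print(plain)
print(regex)
print(regex_stripped)
print(mixed)
```

For plain whitespace-separated words, `line.split()` is the simpler choice; the regular expression is mainly useful when the separator needs to be described more precisely.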

user3486471 Over a year ago

The input text file has different types of data in each row, like: data1 data2 data3 data4 data5 data6 data7. All I want is to treat each 'data' item as a column and write each row of the text file as a single row in a csv file. Hope this helps more.

FrobberOfBits Over a year ago

@user3486471 are you saying it has strange delimiters that separate fields? Do you know what they are? Is this columnar data where, say, the first 10 columns are one field? For follow-up questions, you might want to pastebin a sample of the data you're trying to process.

Joran Beasley · Accepted Answer · 2014-04-01 19:39:02Z

0

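import csv
import re

with open("test.txt") as f:
    lines_of_data = f.read().splitlines()
with open("output.csv", "w", newline="") as offile:
    writer = csv.writer(offile, delimiter='\t', quotechar='"', quoting=csv.QUOTE_ALL)
    # Split each line on runs of four or more whitespace characters.
    writer.writerows(re.split(r"\s\s\s\s+", line.strip()) for line in lines_of_data)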
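The pattern in this answer only splits where at least four whitespace characters appear in a row, so fields that contain single spaces stay intact. A small illustration, using a made-up sample line since the actual input file was not shown:

```python
import re

# Hypothetical line: fields are separated by four or more spaces,
# and some fields contain single spaces.
sample = "John Smith      New York    42"

# Splits only on runs of four or more whitespace characters.
wide_split = re.split(r"\s\s\s\s+", sample.strip())

# Splits on every run of whitespace, breaking multi-word fields apart.
any_split = sample.split()

print(wide_split)
print(any_split)
```

Whether this is the right pattern depends on the real data: if fields are separated by single spaces, as a later comment suggests, a plain `split()` would be the closer fit.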

edited Apr 1, 2014 at 19:39

answered Apr 1, 2014 at 19:01

Joran Beasley

9 Comments

user3486471 Over a year ago

Sorry for my ignorance, but how do I get this 'data'? All I have is an input text file 'f' and an output csv file 'offile'.

Joran Beasley Over a year ago

data is an example of f.read(). Since you never showed us what your input file looks like, I assumed its contents looked something like my data variable.

Joran Beasley Over a year ago

please post a whole textfile or at least several rows

Joran Beasley Over a year ago

Yeah, it would be better if you could provide a link (to download) ... I think pastebin is normalizing your whitespace to normal spaces (I assume the delimiter is tabs).

user3486471 Over a year ago

The link is showing my rows correctly. That is the concern here: the delimiter is not tabs; each field is separated from its preceding field by normal spaces.

Taehee Jeong · Accepted Answer · 2018-11-06 01:06:09Z

0

import pandas as pd

# Read the file and break it into lines.
with open('test.txt') as f:
    lines_of_data = f.read().splitlines()
# Split each line on whitespace; each piece becomes one column.
tmp = [line.split() for line in lines_of_data]
data_df = pd.DataFrame(tmp)
data_df.to_csv('test.csv')

answered Nov 6, 2018 at 1:06

Taehee Jeong

1 Comment

Mozahler Over a year ago

Welcome to Stack Overflow! Code-only answers are discouraged. Please click on edit and add a paragraph or two summarizing how your code addresses the question, or perhaps explain how your answer differs from the previous answer/answers. Thanks.
