I'm working on a script that removes bad characters from the rows of a CSV file and stores the results in a list.

The script runs fine but doesn't remove the bad characters, so I'm a bit puzzled. Any pointers on why it isn't working would be appreciated.

def remove_bad(item):
    item = item.replace("%", "")
    item = item.replace("test", "")
    return item


raw = [] 

with open("test.csv", "rb") as f:
    rows = csv.reader(f)
    for row in rows:
        raw.append((remove_bad(row[0].strip()),
                    row[1].strip().title()))

print raw
  • Could you please add an example of a few rows of your csv and an example of your expected output, and also say explicitly which bad chars you'd like to remove. With the information you provide, it is a bit difficult to help. Commented Jul 15, 2015 at 14:05
  • In spite of the question's title, the problem isn't that append isn't working, but that the code isn't removing all the bad characters. As @Adrianus' answer says, call remove_bad for both items in the input. Commented Jul 15, 2015 at 15:03

2 Answers

If I have a csv-file with one line:

tst%,testT

Then your script, slightly modified, does filter the "bad" characters. I changed it to pass both items separately to remove_bad (because you mentioned you had to "remove bad characters from a csv", not only from the first column):

import csv

def remove_bad(item):
    item = item.replace("%","")
    item = item.replace("test","")
    return item


raw = [] 

with open("test.csv", "rb") as f:
    rows = csv.reader(f)
    for row in rows:
        raw.append((remove_bad(row[0].strip()), remove_bad(row[1].strip()).title()))

print raw

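Also, I call title() after remove_bad rather than before it. Otherwise title() would turn "testT" into "Testt", and the case-sensitive replace("test", "") would no longer match anything.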

Output (the rows will get stored in a list as tuples, as in your example):

[('tst', 'T')]
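
To make the ordering point concrete, here is a small sketch that runs the same remove_bad on the sample value "testT" in both orders. The variable names title_first and clean_first are only for illustration and are not part of your script:

```python
def remove_bad(item):
    # Same case-sensitive replacements as in the script above.
    item = item.replace("%", "")
    item = item.replace("test", "")
    return item

value = "testT"

# title() first: "testT" becomes "Testt", which no longer contains "test".
title_first = remove_bad(value.title())

# remove_bad() first: "test" is stripped, leaving "T", which title() keeps as "T".
clean_first = remove_bad(value).title()

print(title_first)  # Testt
print(clean_first)  # T
```

So if the second column should be both cleaned and title-cased, cleaning has to come first.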

Feel free to ask questions

import re
import csv
p = re.compile('(test|%|anyotherchars)')  # replace anyotherchars with the other bad strings to remove
def remove_bad(item):
    item = p.sub('', item)
    return item

raw =[] 

with open("test.csv", "rb") as f:
    rows = csv.reader(f)
    for row in rows:
        raw.append( ( remove_bad(row[0].strip()),
                     row[1].strip().title() # do you really need strip() without args?
                    ) # here you create a tuple which is appended to the list
                  )

print raw
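
If the bad strings may contain characters that mean something in a regular expression (such as "." or "+"), one option is to build the pattern from a list and escape each entry. This is only a sketch: the list name BAD and its contents are assumptions for illustration, not something taken from your data.

```python
import re

# Hypothetical list of bad strings; edit it to match your data.
BAD = ["test", "%", "."]

# re.escape stops characters such as "." from being read as regex syntax.
pattern = re.compile("|".join(re.escape(s) for s in BAD))

def remove_bad(item):
    # Delete every occurrence of any string in BAD (case-sensitive).
    return pattern.sub("", item)

print(remove_bad("tst%"))   # tst
print(remove_bad("a.b"))    # ab
```

Without re.escape, the "." entry would match any character and the whole string would be removed.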

1 Comment

Okay, here's a question: Why doesn't your code remove all the bad characters?
