I have a CSV file:
ID,"address","used_at","active_seconds","pageviews"
0a1d796327284ebb443f71d85cb37db9,"vk.com",2016-01-29 22:10:52,3804,115
0a1d796327284ebb443f71d85cb37db9,"2gis.ru",2016-01-29 22:48:52,214,24
0a1d796327284ebb443f71d85cb37db9,"yandex.ru",2016-01-29 22:14:30,4,2
0a1d796327284ebb443f71d85cb37db9,"worldoftanks.ru",2016-01-29 22:10:30,41,2
and I need to remove the rows that contain certain words. There are 117 such words.
I tried:
    for line in df:
        if 'yandex.ru' in line:
            df = df.replace(line, '')
but with 117 words it runs too slowly, and when I create a pivot_table afterwards, the words I tried to delete still appear as columns:
                                aaa  10ruslake.ru  youtube.ru  1tv.ru  24open.ru
0  0025977ab2998580d4559af34cc66a4e             0           0      34         43
1  00c651e018cbcc8fe7aa57492445c7a2           230           0       0         23
2  0120bc30e78ba5582617a9f3d6dfd8ca            12           0       0          0
3  01249e90ed8160ddae82d2190449b773            25           0      13         25
Those columns contain only 0.
How can I do this faster, and remove the rows so that those words do not appear as columns?
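    import re

    # Join all words into one regex alternation; escape them so '.' matches literally
    pattern = '|'.join(map(re.escape, words))
    # .str methods only work on text columns, so skip the numeric ones
    for col in df.select_dtypes(include='object'):
        df[col] = df[col].str.replace(pattern, '', case=False, regex=True)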
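Replacing the matched text only blanks the values; the rows themselves stay in the frame. If the goal is for those addresses not to show up as pivot columns at all, it may be simpler to drop the rows before pivoting. Here is a minimal sketch using the sample data from the question. The two-item `words` list is a stand-in for the real 117 words, and the `pivot_table` arguments (summing `active_seconds` per `ID` and `address`) are an assumption, since the question does not show the actual call.

```python
import re
from io import StringIO

import pandas as pd

# Sample rows copied from the question
csv_text = '''ID,"address","used_at","active_seconds","pageviews"
0a1d796327284ebb443f71d85cb37db9,"vk.com",2016-01-29 22:10:52,3804,115
0a1d796327284ebb443f71d85cb37db9,"2gis.ru",2016-01-29 22:48:52,214,24
0a1d796327284ebb443f71d85cb37db9,"yandex.ru",2016-01-29 22:14:30,4,2
0a1d796327284ebb443f71d85cb37db9,"worldoftanks.ru",2016-01-29 22:10:30,41,2
'''
df = pd.read_csv(StringIO(csv_text))

# Hypothetical stand-in for the real list of 117 words
words = ['yandex.ru', 'worldoftanks.ru']

# One vectorized regex test over the whole column instead of a Python loop
pattern = '|'.join(re.escape(w) for w in words)
mask = df['address'].str.contains(pattern, case=False, regex=True, na=False)

# Keep only the rows whose address does NOT match any word
filtered = df[~mask]

# Assumed pivot: total active_seconds per ID and address
table = filtered.pivot_table(index='ID', columns='address',
                             values='active_seconds', aggfunc='sum',
                             fill_value=0)
print(table)
```

Because the unwanted rows are gone before the pivot, those addresses have no rows left to produce columns. If the words are always complete addresses rather than substrings, `df['address'].isin(words)` would be an exact-match alternative to the regex mask.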