I have a text file that is 300 MB in size. I want to read it and then print the 50 most frequently used words. When I run the program, it raises a MemoryError. My code is as follows:
import sys, string
import codecs
import re
from collections import Counter
import collections
import itertools
import csv
import re
import unicodedata
words_1800 = []
with open('E:\\Book\\1800.txt', "r", encoding='ISO-8859-1') as File_1800:
    for line in File_1800:
        sepFile_1800 = line.lower()
        words_1800.extend(re.findall('\w+', sepFile_1800))
for wrd_1800 in [words_1800]:
    long_1800=[w for w in words_1800 if len(w)>3]
    common_words_1800 = dict(Counter(long_1800).most_common(50))
print(common_words_1800)
It gives me the following error:
Traceback (most recent call last):
  File "C:\Python34\CommonWords.py", line 17, in <module>
    words_1800.extend(re.findall('\w+', sepFile_1800))
MemoryError
What is `for wrd_1800 in [words_1800]` supposed to do, exactly?
`words_1800.extend(re.findall('\w+', sepFile_1800))` is giving an endless loop.
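One possible way around the MemoryError is to never build the full list of words. The traceback points at the `extend` call, which keeps every word of the 300 MB file in `words_1800` at once. The sketch below updates a `Counter` line by line instead, so only the distinct words and their counts stay in memory. The function name `most_common_words` and its parameters are made up for illustration. It assumes the file contains line breaks: if the whole text is one enormous line, reading it line by line will not help and it would need to be read in fixed-size chunks instead.

```python
import re
from collections import Counter

# Compile once; the raw string avoids the invalid-escape warning for '\w'.
WORD_RE = re.compile(r'\w+')


def most_common_words(path, n=50, min_len=4, encoding='ISO-8859-1'):
    """Return the n most frequent words of at least min_len characters.

    Hypothetical helper: words are counted as each line is read, so the
    complete word list is never held in memory.
    """
    counts = Counter()
    with open(path, 'r', encoding=encoding) as handle:
        for line in handle:
            # finditer yields matches lazily instead of building a list per line
            words = (m.group() for m in WORD_RE.finditer(line.lower()))
            counts.update(w for w in words if len(w) >= min_len)
    return counts.most_common(n)
```

With the path from the question, calling `print(dict(most_common_words('E:\\Book\\1800.txt')))` should print the same kind of dictionary as the original script. `min_len=4` matches the `len(w)>3` filter.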