I'm doing the splitting of the words from the text file in python. I've receive the number of row (c) and a dictionary (word_positions) with index. Then I create a zero matrix (c, index). Here is the code:
from collections import defaultdict
import re
import numpy as np
c=0
f = open('/Users/Half_Pint_Boy/Desktop/sentenses.txt', 'r')
for line in f:
c = c + 1
word_positions = {}
with open('/Users/Half_Pint_Boy/Desktop/sentenses.txt', 'r') as f:
index = 0
for word in re.findall(r'[a-z]+', f.read().lower()):
if word not in word_positions:
word_positions[word] = index
index += 1
print(word_positions)
matrix=np.zeros(c,index)
My question: How can I populate the matrix to be able to get this: matrix[c,index] = count, where c - is the number of row, index -the indexed position and count -the number of counted words in a row
lines, you can get the number of words just by using,len(lines.split())(length of the array made from the string split at each whitespace)