Hello, I need to read a text of almost 300,000 words, determine the global frequency of each word from an input dictionary, and collect the results in one array. I have a file of sentences and a dictionary file with words and their frequencies. This is my code:
const sentenceFreq = [];
let text = "";
for (const sentence of srcSentences) {
  // remove special characters
  const sentenceWithoutSpecial = sentence.srcLangContent
    .replace(/[`~!@#$%^&*„“()_|+\-=?;:'",.<>\{\}\[\]\\\/]/gi, "");
  text = text + sentenceWithoutSpecial + " ";
}
const words = text.replace(/[.]/g, "").split(/\s/);
words.forEach((w) => {
  // scans the whole dictionary for every single word (this is the slow part)
  const frequency = eng.filter((x) => x.word.toLowerCase() === w.toLowerCase());
  if (frequency[0]) {
    sentenceFreq.push({[frequency[0].freq]: w});
  } else {
    sentenceFreq.push({0: w});
  }
});
This is the English dictionary:
let eng = [
{word:"the",freq:23135851162},
{word:"of",freq:13151942776},
{word:"and",freq:12997637966},
{word:"to",freq:12136980858},
{word:"a",freq:9081174698},
{word:"in",freq:8469404971}
....]
So if my text is "Today is beautiful day", the code should look up each word in the eng dictionary and return its frequency, so the result would be [{1334:"today"},{521:"is"},{678854:"beautiful"},{9754334:"day"}].
The numbers 1334, 521, ... are the frequencies found in the eng dictionary.
The problem is that this is too slow, since I have 300,000 words. Is there a more efficient way to take an array of words and find each one in the array of English dictionary words?
For example, if I have the array ['today', 'is', 'good', 'day'], can I look up all of its values in the eng array at once, instead of looping over each word?
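One possible direction, as a rough sketch rather than a definitive fix: the `eng.filter(...)` call rescans the whole dictionary for every word, so the total work grows with (number of words) × (dictionary size). Building a `Map` from the dictionary once should make each lookup roughly constant time. The helper names `buildFreqIndex` and `lookupFrequencies` and the three-entry sample `eng` array below are made up for illustration, and dropping empty strings left over from splitting is an added assumption, not something the original code does.

```javascript
// Hypothetical sample data in the same shape as the `eng` array above.
const eng = [
  { word: "the", freq: 23135851162 },
  { word: "of", freq: 13151942776 },
  { word: "and", freq: 12997637966 },
];

// Build the index once: lowercased word -> frequency.
function buildFreqIndex(dictionary) {
  const index = new Map();
  for (const { word, freq } of dictionary) {
    const key = word.toLowerCase();
    // Keep the first entry per word, like `filter(...)[0]` in the original code.
    if (!index.has(key)) index.set(key, freq);
  }
  return index;
}

// Look up every word in the index; unknown words get frequency 0.
function lookupFrequencies(words, index) {
  return words
    .filter((w) => w.length > 0) // assumption: skip empty strings from split()
    .map((w) => ({ [index.get(w.toLowerCase()) ?? 0]: w }));
}

const index = buildFreqIndex(eng);
const result = lookupFrequencies(["The", "cat", "and", "", "of"], index);
console.log(result);
```

A loop over the words is still there (inside `map`), but it no longer contains a second loop over the dictionary, which is where the time goes with 300,000 words.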