I have a .txt file like this:

ancient 0.4882
detained 5.5512
neighboring 2.9644
scores 0.5951
eggs 0.918
excesses 3.0974
proceedings 0.7446
menem 1.7971

I want to display the top 3 words by comparing their value in one list and the remaining words in another list.

i.e., the output for this example should be:

[detained, excesses, neighboring] & [menem, eggs, proceedings, scores, ancient]

How to do that?

EDIT:

I forgot to mention one thing: I want to consider only those words that have a value greater than 0.5. How can I do that?

  • Where are you stuck? Do you know how to open and read the file? Commented Nov 20, 2014 at 2:27
  • @monkut Ya, I was doing something like, having one list have all the words, and the second list have all the float values, and then sort the second list.. but then I'm lost! Commented Nov 20, 2014 at 2:33
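
The two-list idea from the comment above loses the link between a word and its value once the value list is sorted on its own. A minimal sketch of one way around that, assuming the two lists have already been read from the file (the `words` and `values` lists below are typed in by hand from the question's data, not read from disk): pair the lists with `zip` and sort the pairs.

```python
# Hand-typed stand-ins for the two lists built from the file (assumption).
words = ["ancient", "detained", "neighboring", "scores",
         "eggs", "excesses", "proceedings", "menem"]
values = [0.4882, 5.5512, 2.9644, 0.5951, 0.918, 3.0974, 0.7446, 1.7971]

# Pair each value with its word so that sorting moves them together.
pairs = sorted(zip(values, words), reverse=True)

# Drop the values again and split into the top 3 and the remainder.
ranked_words = [word for value, word in pairs]
top_3 = ranked_words[:3]
rest = ranked_words[3:]

print(top_3)
print(rest)
```

Putting the value first in each pair lets the default tuple ordering do the comparison, so no `key` function is needed.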

3 Answers

import csv
with open('x.txt') as f:
    # use space as delimiter
    reader = csv.reader(f, delimiter=' ')
    # sort by the numeric value in the second column, i.e. float(x[1]);
    # raw strings compare character by character, so '10.0' would rank below '9.5'
    s = sorted(reader, key=lambda x: float(x[1]), reverse=True)
    # keep only values greater than 0.5 and take the word (first column) only
    l = [x[0] for x in s if float(x[1]) > 0.5]
    print(l[:3])
    print(l[3:])
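
A note on the sort key: `csv.reader` yields every field as a string, so a key that compares the raw second field orders the values lexicographically rather than numerically. With the question's data the two orders happen to agree, but they diverge as soon as a value reaches 10 or more. A small illustration with made-up rows (the names and numbers below are hypothetical, not from the question):

```python
# Hypothetical rows in the shape csv.reader produces: lists of strings.
rows = [["alpha", "10.0"], ["beta", "9.5"], ["gamma", "2.25"]]

# Comparing the raw strings: '9' > '2' > '1', so "10.0" ends up last.
by_string = sorted(rows, key=lambda row: row[1], reverse=True)

# Converting to float first gives the numeric order.
by_number = sorted(rows, key=lambda row: float(row[1]), reverse=True)

print([row[0] for row in by_string])
print([row[0] for row in by_number])
```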

2 Comments

I had one more thing which I forgot to ask. Could you help me with that?
I want to consider only those words that have a value greater than 0.5. How can I do that?
import csv
with open('inputFile.csv', 'r') as inputFile:
    reader = csv.reader(inputFile, delimiter=" ")
    word = dict()
    for line in reader:
        # keep only words whose value is greater than 0.5
        if float(line[1]) > 0.5:
            word[line[0]] = float(line[1])

    # sort (word, value) pairs by value, largest first;
    # items() works on both Python 2 and Python 3
    sortedArray = sorted(word.items(), key=lambda x: -x[1])
    maxWords = sortedArray[:3]
    Remaining = sortedArray[3:]
    print(maxWords)
    print(Remaining)

4 Comments

It would be great if you could explain this too. I have a .txt file, not a .csv file by the way.
You might want to do the sorting only once. Like sorted_words = sorted(word.iteritems(), key=lambda x:-x[1]) and then max_words = sorted_words[:3] and remaining = sorted_words[3:].
Could you tell me one more thing: I want to consider only those words that have a value greater than 0.5. How can I do that?
I would simply eliminate those words before putting them in the dictionary. I updated the answer to reflect the above comments. Also, note that you can use a txt file instead of inputFile.csv.

The answers using csv are more concise than mine but here is another approach.

from operator import itemgetter

with open('file_list_data.txt', 'r') as f:
    # split each line into [word, value]; skip blank lines
    records = [line.split() for line in f if line.strip()]

# convert the value to float and keep only records greater than 0.5
records_with_numbers = [(r[0], float(r[1])) for r in records if float(r[1]) > 0.5]

sorted_records = sorted(records_with_numbers, key=itemgetter(1), reverse=True)

top_3 = [word for (word, score) in sorted_records[0:3]]
rest = [word for (word, score) in sorted_records[3:]]

print(top_3)
print(rest)

2 Comments

I want to consider only those words that have a value greater than 0.5. How can I do that?
Added a filter to only keep words with value greater than 0.5 when producing the variable records_with_numbers.
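
For reference, the filter-then-sort-then-slice steps discussed in this thread can be wrapped in one small function. This is only a sketch under a few assumptions: the helper name `split_top` and its parameters `n` and `threshold` are made up for illustration, and the sample text is pasted in from the question instead of being read from a file.

```python
def split_top(lines, n=3, threshold=0.5):
    """Return (top n words, remaining words), keeping values above threshold."""
    records = []
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        word, value = parts[0], float(parts[1])
        if value > threshold:
            records.append((word, value))
    # Sort by the numeric value, largest first.
    records.sort(key=lambda record: record[1], reverse=True)
    ranked = [word for word, _ in records]
    return ranked[:n], ranked[n:]


# Sample data copied from the question.
sample = """ancient 0.4882
detained 5.5512
neighboring 2.9644
scores 0.5951
eggs 0.918
excesses 3.0974
proceedings 0.7446
menem 1.7971"""

top, rest = split_top(sample.splitlines())
print(top)
print(rest)
```

With the 0.5 threshold applied, `ancient` (0.4882) is dropped, so it no longer appears in the second list. To run this against a file, an open file object can be passed in place of `sample.splitlines()`, since the function only iterates over lines.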
