I have computed all the values, but when I try to sort, the two lists are sorted separately rather than together. I want to sort by year, with each grade following its year. However, years.sort() sorts only years and leaves grades in their original order.

When I open the file, it contains something like:

Year,Grade
2000,84
2001,34
2002,82
2012,74
2008,90

and so forth. I have already calculated the average grade for each year:

years, average_grades = [],[]
avg = []
d = {}

with open(file,'r') as f:
    next(f)
    for line in f:
        year, grade = (s.strip() for s in line.split(','))
        if year in d:
            d[year][0] += int(grade)
            d[year][1] += 1
        else:
            d[year] = [int(grade),1]

    for year, grades in d.items():
        years.append(str(year))
        average_grades.append(float(grades[0]) / grades[1])

    return years, average_grades

Without sort, it would give me similar to this:

2001 74.625
2006 72.241
2012 70.875
2017 69.1981
2005 72.5
2008 71.244
2014 73.318
2004 72.1
2007 72.88
2000 73.1

With years.sort(), it would give me similar to this:
2000 74.625
2001 72.241
2002 70.875
2003 69.1981
2004 72.5
2005 71.244
2006 73.318
2007 72.1

So the sort works only on years and does not reorder grades to match. This problem has been bugging me for a long time. I am not planning to use pandas.

  • Can you post an example of the data? Commented Oct 16, 2017 at 14:48
  • I have provided example data. Commented Oct 16, 2017 at 14:51
  • The problem is that you are returning two independent data structures. The returned years and average_grades are not related when you return them. Commented Oct 16, 2017 at 14:53

4 Answers

Use zip to pair the two lists up as (year, grade) tuples, then sort the pairs.

Example :

>>> y = [3, 2, 4, 1, 2]
>>> g = [0.1, 0.4, 0.2, 0.7, 0.1]

>>> mix = list(zip(y, g))
>>> mix
[(3, 0.1), (2, 0.4), (4, 0.2), (1, 0.7), (2, 0.1)]

>>> sorted(mix)
[(1, 0.7), (2, 0.1), (2, 0.4), (3, 0.1), (4, 0.2)]

>>> # print in sorted order:
>>> for ele in sorted(mix):
...     print(ele[0], ele[1])
...
1 0.7
2 0.1
2 0.4
3 0.1
4 0.2

Note that year 2 appears twice, with grades 0.1 and 0.4. Tuples are compared element by element, so the sort orders by year first and uses the grade only to break ties between equal years.
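The tie-breaking behaviour noted above can be checked directly. The sketch below reuses the same y and g values and compares the default tuple sort with a sort keyed on the year only. operator.itemgetter is from the standard library; the variable names by_tuple and by_year_only are made up for illustration.

```python
from operator import itemgetter

y = [3, 2, 4, 1, 2]
g = [0.1, 0.4, 0.2, 0.7, 0.1]
mix = list(zip(y, g))

# Default: compare whole tuples, so equal years are ordered by grade.
by_tuple = sorted(mix)

# Keyed on the year only: Python's sort is stable, so equal years
# keep the order they had in the input.
by_year_only = sorted(mix, key=itemgetter(0))

print(by_tuple)      # [(1, 0.7), (2, 0.1), (2, 0.4), (3, 0.1), (4, 0.2)]
print(by_year_only)  # [(1, 0.7), (2, 0.4), (2, 0.1), (3, 0.1), (4, 0.2)]
```

For the question's data each year appears only once after averaging, so both variants should give the same result there.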


You want to add this line before the return statement:

years, average_grades = zip(*sorted(zip(years, average_grades), key=lambda p: p[0]))

What does this do?

The inner zip(years, average_grades) tells Python to pair up the corresponding elements of years and average_grades, producing a sequence of (year, grade) tuples.

sorted(..., key=lambda p: p[0]) is the usual sorted utility, except that it is now operating on pairs, so it needs to know how to order them. We pass it a lambda function that says "look at the first part", i.e. the year.

The outer zip(*...) takes the list of tuples returned by sorted and converts it back into two separate sequences. The * unpacks the list into individual arguments, so each pair is passed to zip as its own argument. zip accepts any number of iterables and groups their first elements together, then their second elements, and so on. In this case, it takes the ten pairs and produces 2 tuples of length 10 each. Note that these are tuples, not lists; wrap each in list() if you need lists.

As long as your iterables are of the same length, this is a "basic" mechanism to sort them together.
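As a minimal sketch of the zip / sorted / zip(*...) idiom described above, wrapped in a reusable function: sort_together is a hypothetical helper name, not something from the question, and the sample values are taken from the question's unsorted output. The empty-input guard is an addition, since unpacking zip(*[]) into two names would otherwise fail.

```python
def sort_together(keys, values):
    """Sort keys ascending and reorder values to match (hypothetical helper)."""
    if len(keys) != len(values):
        raise ValueError("keys and values must have the same length")
    # Pair the elements up, then sort the pairs by their first element.
    pairs = sorted(zip(keys, values), key=lambda p: p[0])
    if not pairs:
        # zip(*[]) yields nothing, so handle empty input explicitly.
        return [], []
    # Un-zip the sorted pairs back into two sequences.
    sorted_keys, sorted_values = zip(*pairs)
    return list(sorted_keys), list(sorted_values)


years = ["2001", "2012", "2000"]
average_grades = [74.625, 70.875, 73.1]
years, average_grades = sort_together(years, average_grades)
print(years)           # ['2000', '2001', '2012']
print(average_grades)  # [73.1, 74.625, 70.875]
```

The years here are strings, as in the question's code; string comparison happens to match numeric order as long as every year has the same number of digits.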

2 Comments

Not a great answer: If OP has control over posted code, it is much cleaner to keep (year, grade) tuples together. Then sort, then un-zip. No downvote though.
The only difference between maintaining the variables separately and zipping them is the use of zip. There are plenty of real-world examples where you'll have two separate sets of data. @Unapiedra, please remember the goal of Stack Overflow is to create a repository of knowledge for future users. It is valid to assert that in this instance, not separating the two lists would be the better solution; it is not valid to claim that an answer that provides enough information to help future questioners is "not a great answer" because of an implementation you provide for this specific instance.

I hope this example helps:

years = [2001,2000,2002]
average_grades = [5,10,15]
result = zip(years,average_grades)
for item in sorted(result, key=lambda x: x[0]):
    print('{} {}'.format(*item))
#2000 10
#2001 5
#2002 15


The other answers take the two result lists and zip them back together. Since you seem to control the code that reads the file, I suggest never splitting the years and grades apart in the first place.

In my opinion this is easier than later combining the two lists with zip.

years, average_grades = [],[]
avg = []
d = {}

with open(file,'r') as f:
    next(f)
    for line in f:
        year, grade = (s.strip() for s in line.split(','))
        if year in d:
            d[year][0] += int(grade)
            d[year][1] += 1
        else:
            d[year] = [int(grade),1]

# List comprehension that converts the 'd' dictionary into a list of tuples.
# Each entry is (year, average grade).
year_grade = [(year, float(grade_tuple[0]) / grade_tuple[1])
              for year, grade_tuple in d.items()]

# Sorting is optional if you return the list of tuples.
# Use 'key=lambda ...' to sort by the year (the first element of the tuple).
# Technically, specifying the 'key' is not necessary, as the default is
# to compare the first element first.
year_grade.sort(key=lambda x: x[0])

return year_grade
# Alternatively, return the data as two tuples (years, grades) instead:
# return zip(*year_grade)

Other improvements

You can use a defaultdict to avoid the if year in d block:

from collections import defaultdict

d = defaultdict(lambda: [0, 0])

with open(fname,'r') as f:
    next(f)
    for line in f:
        year, grade = (s.strip() for s in line.split(','))
        d[year][0] += int(grade)
        d[year][1] += 1

    def avg(t):
        return float(t[0]) / t[1]
    year_grade = [(y, avg(g)) for y, g in d.items()]
    year_grade.sort()

    return zip(*year_grade)  # Python3: tuple(zip(*year_grade))
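Putting the pieces of this answer together, here is a self-contained sketch that can be run as-is. The function name yearly_averages is made up for illustration, the years are converted to int (an assumption, the code above keeps them as strings), and io.StringIO stands in for the real file. The sample rows are the ones from the question, plus one invented extra row for 2000 so that an average is actually computed.

```python
import io
from collections import defaultdict


def yearly_averages(lines):
    """Return a sorted list of (year, average grade) from CSV lines.

    'lines' is any iterator of text lines with a header row first
    (hypothetical helper; name and signature are not from the question).
    """
    totals = defaultdict(lambda: [0, 0])  # year -> [sum of grades, count]
    next(lines)  # skip the "Year,Grade" header
    for line in lines:
        if not line.strip():
            continue  # ignore blank lines
        year, grade = (s.strip() for s in line.split(","))
        totals[int(year)][0] += int(grade)
        totals[int(year)][1] += 1
    # Keep each year with its average, then sort the pairs by year.
    return sorted((year, total / count)
                  for year, (total, count) in totals.items())


# Rows from the question, plus one made-up extra row for 2000.
sample = io.StringIO(
    "Year,Grade\n2000,84\n2001,34\n2002,82\n2012,74\n2008,90\n2000,90\n"
)
for year, average in yearly_averages(sample):
    print(year, average)
# 2000 87.0
# 2001 34.0
# 2002 82.0
# 2008 90.0
# 2012 74.0
```

With a real file, the same function could be called as yearly_averages(f) inside a with open(...) block, since a file object is also an iterator of lines.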
