How can I create a new data frame from the existing columns? It should contain the average of column 'a' for each distinct value of x. For example, for x=1: a_new = the sum of the 'a' values where x=1, divided by 3. The same applies to x=2, x=3, and so on.

import pandas as pd
data = {'x': [1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4], 'a': [0.4, 0.88, 0.2, 0.1, 0.75, 0.98, 0.33, 0.22, 0.15, 0.14, 0.73, 0.25], 'year': [2000, 2000, 2000, 2000, 2001, 2001, 2001, 2001, 2002, 2002, 2002, 2002]}
df = pd.DataFrame(data)
df

    x    a      year
0   1   0.40    2000
1   2   0.88    2000
2   3   0.20    2000
3   4   0.10    2000
4   1   0.75    2001
5   2   0.98    2001
6   3   0.33    2001
7   4   0.22    2001
8   1   0.15    2002
9   2   0.14    2002
10  3   0.73    2002
11  4   0.25    2002

Expected Output:

    x   a_new
0   1   0.43
1   2   0.67
2   3   0.42
3   4   0.19
    Look into DataFrame.groupby(); that will do what you need. Commented Apr 13, 2022 at 14:17

1 Answer

This might be what you're after: group the rows by x and take the mean of 'a' within each group.

df.groupby('x')['a'].mean()
x
1    0.433333
2    0.666667
3    0.420000
4    0.190000
Name: a, dtype: float64
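
The result above is a Series indexed by x, while the question asks for a data frame with columns x and a_new. A minimal sketch of one way to get that shape, assuming two-decimal rounding is acceptable (the question's expected output only shows two decimals, so the rounding step is my assumption):

```python
import pandas as pd

data = {
    'x': [1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4],
    'a': [0.4, 0.88, 0.2, 0.1, 0.75, 0.98, 0.33, 0.22, 0.15, 0.14, 0.73, 0.25],
    'year': [2000, 2000, 2000, 2000, 2001, 2001, 2001, 2001, 2002, 2002, 2002, 2002],
}
df = pd.DataFrame(data)

result = (
    df.groupby('x', as_index=False)['a']  # keep x as a regular column, not the index
      .mean()                             # average of 'a' within each x group
      .rename(columns={'a': 'a_new'})     # column name taken from the question
      .round(2)                           # assumption: two decimals are enough
)
print(result)
```

Selecting 'a' before calling mean() also avoids averaging the 'year' column, which is not needed here. Drop the round(2) call if you want the full-precision means.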
