
I have a dataframe like this, where the codes column is currently strings.

Station Codes
1 1,2
1 1
2 1
2 2,5
2 2,3
3 1
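
For anyone who wants to reproduce this, the sample above could be built roughly as follows. This is only a sketch: it assumes every Codes value really is a string, as stated, and the variable name df is my own choice.

```python
import pandas as pd

# Sample data from the table above; Codes holds comma-separated strings.
df = pd.DataFrame(
    {
        "Station": [1, 1, 2, 2, 2, 3],
        "Codes": ["1,2", "1", "1", "2,5", "2,3", "1"],
    }
)
print(df)
```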

I want to see the count of each code within each station, ordered by station. I have tried the explode function, but every cell whose string contains only one number ends up as NaN.

Station Codes Count
1 1 2
1 2 1
2 1 1
2 2 2
2 3 1
2 5 1
3 1 1

2 Answers

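print(
    # Split each string into a list of codes; explode then gives one row per code.
    df.assign(Codes=df.Codes.str.split(","))
    .explode("Codes")
    # Count the rows for every (Station, Codes) pair.
    .groupby(["Station", "Codes"], as_index=False)
    .size()
    .rename(columns={"size": "Count"})
)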

Prints:

   Station Codes  Count
0        1     1      2
1        1     2      1
2        2     1      1
3        2     2      2
4        2     3      1
5        2     5      1
6        3     1      1

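# Split each string into a list of codes (this overwrites the Codes column in place).
df['Codes'] = df['Codes'].str.split(',')
# Count codes per station; within a station, rows come out ordered by descending count.
df.explode('Codes').groupby('Station')['Codes'].value_counts().reset_index(name='Count')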

1 Comment

I have tried using this approach as well, but using .split(",") results in NaN values in the cells that only have one number (no comma)
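
One possible cause of those NaN values, assuming the Codes column is actually object dtype with a mix of strings and integers (for example because single-code cells were parsed as numbers on load): .str.split returns NaN for any element that is not a string. Under that assumption, casting to str before splitting keeps the single-code rows. The mixed-type frame below is hypothetical and only illustrates the behavior.

```python
import pandas as pd

# Hypothetical mixed-type column: multi-code cells are strings,
# single-code cells were parsed as integers.
df = pd.DataFrame(
    {
        "Station": [1, 1, 2, 2, 2, 3],
        "Codes": ["1,2", 1, 1, "2,5", "2,3", 1],
    }
)

# .str.split yields NaN for the non-string cells.
naive = df["Codes"].str.split(",")
print(naive)

# Casting to str first keeps the single-code rows.
counts = (
    df.assign(Codes=df["Codes"].astype(str).str.split(","))
    .explode("Codes")
    .groupby(["Station", "Codes"], as_index=False)
    .size()
    .rename(columns={"size": "Count"})
)
print(counts)
```

Checking df["Codes"].map(type).value_counts() on the real data should show whether this is what is happening. If the single-code cells turn out to be floats rather than integers, astype(str) would produce values like "1.0", so they would need converting to integers first.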
