I have the following dataframe:

Company Year  Status
A       2021  Unpaid
B       2021  Paid
C       2021  Unpaid
D       2021  Paid
A       2020  Unpaid
B       2020  Unpaid
C       2020  Paid
D       2020  Paid

I want to get a list of the companies that were unpaid in 2020 but paid in 2021 (so just C). I can do this in Excel with no problem, but I can't figure out how to do it in pandas. I'm stumped.

  • I tried df[(df['Year'].isin(['2020']) & df['Status'].isin(['Unpaid'])) & (df['Year'].isin(['2021']) & df['Status'].isin(['PAID']))], but it returns an empty result. Commented Aug 26, 2022 at 0:25

1 Answer

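You can pivot, then use query. Note that in the sample data C is paid in 2020 and unpaid in 2021, so that is the condition the query tests; swap the two years to select companies that were unpaid in 2020 and paid in 2021 (B in this data).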

import pandas as pd


data = {
    "Company": ["A", "B", "C", "D", "A", "B", "C", "D"],
    "Year": [2021, 2021, 2021, 2021, 2020, 2020, 2020, 2020],
    "Status": ["Unpaid", "Paid", "Unpaid", "Paid", "Unpaid", "Unpaid", "Paid", "Paid"]
}

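# Pivot to one row per company, with a "Paid" and an "Unpaid" column holding
# the year of each status. pivot_table aggregates with the mean by default, so
# a company with the same status in both years gets 2020.5 in one column and
# NaN in the other, and the query below filters it out.
answer = (
    pd
    .DataFrame(data)
    .pivot_table(index="Company", columns="Status", values="Year")
    .reset_index()
    .query("Paid == 2020 & Unpaid == 2021")
    ["Company"].tolist()
)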
print(answer)
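As for why the attempt in the comment comes back empty: the mask is evaluated one row at a time, and no single row can have Year equal to both 2020 and 2021, so the combined condition is always False. In addition, 'PAID' does not match 'Paid', and the string '2020' will not match if Year is stored as integers (an assumption here, as in the data dict above). A possible alternative, sketched below, is to pivot on Year instead of Status so that each company's statuses sit side by side. The helper companies_with is a hypothetical name used for illustration, not a pandas function.

```python
import pandas as pd

df = pd.DataFrame({
    "Company": ["A", "B", "C", "D", "A", "B", "C", "D"],
    "Year": [2021, 2021, 2021, 2021, 2020, 2020, 2020, 2020],
    "Status": ["Unpaid", "Paid", "Unpaid", "Paid", "Unpaid", "Unpaid", "Paid", "Paid"],
})

# One row per company, one column per year, each cell holding that year's status.
# pivot() raises an error if a (Company, Year) pair appears more than once.
wide = df.pivot(index="Company", columns="Year", values="Status")


def companies_with(wide, status_by_year):
    """Hypothetical helper: companies whose status matches in every given year."""
    mask = pd.Series(True, index=wide.index)
    for year, status in status_by_year.items():
        mask &= wide[year].eq(status)
    return mask[mask].index.tolist()


paid_then_unpaid = companies_with(wide, {2020: "Paid", 2021: "Unpaid"})
unpaid_then_paid = companies_with(wide, {2020: "Unpaid", 2021: "Paid"})

print(paid_then_unpaid)  # ['C']
print(unpaid_then_paid)  # ['B']
```

Compared with pivoting on Status, this layout does not rely on averaging years, and it extends to more than two years by adding entries to the dict.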