1

I am trying to add a synthetic data column to the existing the movies dataset. This new column is the gross revenue of an actor's second most recent movie.

For example:

Movie Actor Revenue New Column*
A Nic Cage $7 $5
B Nic Cage $6 $4
C Nic Cage $5 -
D Nic Cage $4 -
E Al Pacino $3 $1
F Al Pacino $2 -
G Al Pacino $1 -

What is the most efficient way to code this in python?

2
  • This looks like homework. Can you give us your best attempt and we can make suggestions? Commented Nov 16, 2022 at 3:06
  • Second most recent? I don't see any dates here. Commented Nov 16, 2022 at 3:06

1 Answer 1

1

Assuming the movies are sorted, use groupby.shift

df['New Column'] = df.groupby('Actor')['Revenue'].shift(-2, fill_value='-')

Output:

  Movie      Actor Revenue New Column
0     A   Nic Cage      $7         $5
1     B   Nic Cage      $6         $4
2     C   Nic Cage      $5          -
3     D   Nic Cage      $4          -
4     E  Al Pacino      $3         $1
5     F  Al Pacino      $2          -
6     G  Al Pacino      $1          -
Sign up to request clarification or add additional context in comments.

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.