How to add a new column into an existing DataFrame?

Question

I am trying to add a synthetic data column to the existing the movies dataset. This new column is the gross revenue of an actor's second most recent movie.

For example:

Movie	Actor	Revenue	New Column*
A	Nic Cage	$7	$5
B	Nic Cage	$6	$4
C	Nic Cage	$5	-
D	Nic Cage	$4	-
E	Al Pacino	$3	$1
F	Al Pacino	$2	-
G	Al Pacino	$1	-

What is the most efficient way to code this in python?

This looks like homework. Can you give us your best attempt and we can make suggestions? — tdelaney
– tdelaney, Commented Nov 16, 2022 at 3:06

mozway · Accepted Answer · 2022-11-16 03:06:40Z

1

Assuming the movies are sorted, use groupby.shift

df['New Column'] = df.groupby('Actor')['Revenue'].shift(-2, fill_value='-')

Output:

  Movie      Actor Revenue New Column
0     A   Nic Cage      $7         $5
1     B   Nic Cage      $6         $4
2     C   Nic Cage      $5          -
3     D   Nic Cage      $4          -
4     E  Al Pacino      $3         $1
5     F  Al Pacino      $2          -
6     G  Al Pacino      $1          -

answered Nov 16, 2022 at 3:06

mozway

267k13 gold badges55 silver badges106 bronze badges

Sign up to request clarification or add additional context in comments.

Collectives™ on Stack Overflow

How to add a new column into an existing DataFrame?

1 Answer 1

Comments

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

Comments

Related