I have a dataset in which authors are ranked by order of authorship (1, 2, 3, etc.).
Authorid  Author  Article    Articleid  Rank
1         John    article 1  1          1
1         John    article 2  2          2
1         John    article 3  3          3
1         John    article 4  4          3
2         Mary    article 5  5          1
2         Mary    article 6  6          2
2         Mary    article 7  7          1
2         Mary    article 8  8          8
I want to create three more Boolean columns: If_first, If_second, and If_last.
The goal is to show whether the author is ranked first, second, or last on the article.
"Last" means the maximum value in the Rank column for that Authorid.
I can do If_first and If_second, which is pretty easy, but I am not sure how to handle If_last.
df.loc[df['Rank'] == 1, 'If_first'] = 1
df.loc[df['Rank'] != 1, 'If_first'] = 0
df.loc[df['Rank'] == 2, 'If_second'] = 1
df.loc[df['Rank'] != 2, 'If_second'] = 0
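As an aside, each pair of assignments above could probably be collapsed into a single line by casting the Boolean comparison to an integer. A minimal sketch, assuming a small sample frame built from the ranks in the expected output of this question (the construction of df here is only for illustration):

```python
import pandas as pd

# Illustrative sample only: ranks taken from the expected output in this question.
df = pd.DataFrame({
    "Authorid": [1, 1, 1, 2, 2, 2, 2],
    "Rank": [1, 2, 3, 1, 2, 3, 4],
})

# A comparison yields a Boolean Series; astype(int) turns True/False into 1/0.
df["If_first"] = (df["Rank"] == 1).astype(int)
df["If_second"] = (df["Rank"] == 2).astype(int)
```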
Two rules here:
If If_first = If_last, treat the author as If_first.
If If_second = If_last, treat the author as If_second.
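These two rules, together with the definition of "last" as the per-Authorid maximum of Rank, might be sketched as follows. This is only one possible approach, assuming a groupby with transform("max") is acceptable here; the sample frame and the names max_rank and is_last are introduced purely for illustration.

```python
import pandas as pd

# Illustrative sample only: ranks taken from the expected output in this question.
df = pd.DataFrame({
    "Authorid": [1, 1, 1, 2, 2, 2, 2],
    "Rank": [1, 2, 3, 1, 2, 3, 4],
})

# Per-author maximum rank, broadcast back onto every row of that author.
max_rank = df.groupby("Authorid")["Rank"].transform("max")

# A row is a candidate for "last" when its rank equals that maximum.
is_last = df["Rank"] == max_rank

# Rank 1 or 2 takes priority over "last", so those rows are excluded.
df["If_last"] = (is_last & ~df["Rank"].isin([1, 2])).astype(int)
```

Note that if several rows of one author share the maximum rank, this sketch flags all of them as last.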
Expected output:
Authorid  Author  Article    Articleid  Rank  If_first  If_second  If_last
1         John    article 1  1          1     1         0          0
1         John    article 2  2          2     0         1          0
1         John    article 3  3          3     0         0          1        (third is the last here)
2         Mary    article 5  5          1     1         0          0
2         Mary    article 6  6          2     0         1          0
2         Mary    article 7  7          3     0         0          0        (third is not the last here, because of the fourth below, so all zeros)
2         Mary    article 8  8          4     0         0          1        (fourth is the last here)