I am trying to convert survey data on marital status, which looks as follows:
df['d11104'].value_counts()
[1] Married 1 250507
[2] Single 2 99131
[4] Divorced 4 32817
[3] Widowed 3 24839
[5] Separated 5 8098
[-1] keine Angabe 2571
Name: d11104, dtype: int64
So far, I ran df['marstat'] = df['d11104'].cat.codes.astype('category'), which yields:
df['marstat'].value_counts()
1 250507
2 99131
4 32817
3 24839
5 8098
0 2571
Name: marstat, dtype: int64
Now I'd like to add labels to the column marstat such that the numerical values are maintained, i.e. I'd like to identify people by the condition df['marstat'] == 1 while at the same time having the labels ['Married','Single','Divorced','Widowed'] attached to this variable. How can this be done?
EDIT: Thanks to jpp's answer, I simply created a new variable and defined the labels by hand:
df['marstat_lb'] = df['marstat'].map({1: 'Married', 2: 'Single', 3: 'Widowed', 4: 'Divorced', 5: 'Separated'})
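For anyone wanting to reproduce this, here is a minimal sketch of the same approach on made-up toy data. The toy DataFrame and the plain integer dtype of marstat are assumptions for illustration, not the real survey file. One thing to be aware of: codes that are not in the dictionary (here 0, the "[-1] keine Angabe" group) end up as NaN in the label column.

```python
import pandas as pd

# Toy stand-in for the survey column; the values are invented for illustration.
df = pd.DataFrame({"marstat": [1, 2, 4, 3, 5, 0, 1]})

# Hand-written mapping from numeric code to label, as in the EDIT above.
labels = {1: "Married", 2: "Single", 3: "Widowed", 4: "Divorced", 5: "Separated"}

# Series.map looks each code up in the dict; codes without an entry become NaN.
df["marstat_lb"] = df["marstat"].map(labels)

# The numeric column is untouched, so conditions on the codes still work.
married = df[df["marstat"] == 1]
print(df)
```

The numeric column and the label column stay side by side, so filtering uses marstat while tables and plots can use marstat_lb.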