I have the following pandas DataFrame:
ID COL1 COL2
123 1 ABC
123 1 CCC
123 NaN AVV
345 2 FGG
345 NaN FRG
345 NaN FGT
I need to fill all NaN values in COL1 with the value from the same ID, to get this result:
ID COL1 COL2
123 1 ABC
123 1 CCC
123 1 AVV
345 2 FGG
345 2 FRG
345 2 FGT
I could write a for loop, but it would take a long time to run on my dataset. Is there a conditional replace function?
Does df.groupby('ID').ffill().bfill() give what you need?
Another option is df.sort_values(['ID', 'COL1']).ffill(), which seems to be 3 to 4 times faster than the above method. It sorts the NaN values to the end of each ID group and then uses only the ffill() method to fill the missing values.
Instead of NaN I have Not-Defined. Can I still use ffill()?
What is Not-Defined? Is it a string or null?
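Here is a minimal sketch of the two suggestions from the comments, run on the sample data from the question. A few points are my assumptions rather than anything stated in the thread. The grouped version uses transform so that both ffill and bfill stay inside each ID group (a bare .bfill() after groupby(...).ffill() runs across the whole frame). The sorted version adds sort_index() to restore the original row order. The last part assumes Not-Defined is a literal string in an otherwise numeric column, which the thread never confirms. The variable names are only illustrative.

```python
import numpy as np
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "ID": [123, 123, 123, 345, 345, 345],
    "COL1": [1, 1, np.nan, 2, np.nan, np.nan],
    "COL2": ["ABC", "CCC", "AVV", "FGG", "FRG", "FGT"],
})

# Option 1: forward- and back-fill COL1 within each ID group.
# transform keeps the original index, so the result aligns with df.
filled_grouped = df.copy()
filled_grouped["COL1"] = df.groupby("ID")["COL1"].transform(
    lambda s: s.ffill().bfill()
)

# Option 2: sort so that NaN comes last within each ID (the default
# na_position), forward-fill, then restore the original row order.
# Caveat: an ID whose COL1 values are all NaN would inherit the value
# of the previous ID, so this relies on every ID having a known value.
filled_sorted = df.sort_values(["ID", "COL1"]).ffill().sort_index()

# Placeholder case (assumption: "Not-Defined" is a string in a numeric column).
# Turn the placeholder into NaN first, then fill as in option 1.
placeholder = pd.DataFrame({
    "ID": [123, 123, 345, 345],
    "COL1": [1, "Not-Defined", 2, "Not-Defined"],
})
col = placeholder["COL1"]
placeholder["COL1"] = col.mask(col == "Not-Defined").astype("float64")
placeholder["COL1"] = placeholder.groupby("ID")["COL1"].transform(
    lambda s: s.ffill().bfill()
)

print(filled_grouped)
print(filled_sorted)
print(placeholder)
```

If Not-Defined turns out to be a real null rather than a string, the mask step is unnecessary and the fill methods should work directly. Which option is faster probably depends on the number of groups, so it is worth timing both on the real dataset.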