I have a dataset in the following format:
[[ 226 600 3.33 915. 92.6 98.6 ]
[ 217 700 3.34 640. 93.7 98.5 ]
[ 213 900 3.35 662. 88.8 96. ]
...
[ 108 600 2.31 291. 64. 70.4 ]
[ 125 800 3.36 1094. 65.5 84.1 ]
[ 109 400 2.44 941. 52.3 68.7 ]]
Each column is a separate criterion with its own value range. How can I replace values that are 0 with a value greater than zero, based on that column's range? In other words, each 0 should become the worst (smallest) value in its column other than 0.
I have written the following code, but it can only replace each 0 with the column minimum (which is of course 0) or the column maximum, which varies by column. Thanks for your help!
import numpy as np

# Impute 0 values -- give them the worst value for that column
I, J = np.nonzero(scores == 0)  # row and column indices of every 0 entry
scores[I, J] = scores.min(axis=0)[J]  # can only do min or max
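One possible direction, as a rough sketch rather than a definitive answer: hide the zeros behind `np.inf` before taking the column-wise minimum, so the minimum that comes back is the smallest non-zero value. This assumes that "worst" means "smallest", that all real values are positive and finite, and that a column consisting only of zeros should stay at 0. The function name `impute_zeros_with_column_min` and the small `example` array (with zeros added for illustration) are hypothetical and not part of the dataset above.

```python
import numpy as np


def impute_zeros_with_column_min(scores):
    """Return a float copy with each 0 replaced by the smallest non-zero value in its column."""
    scores = np.asarray(scores, dtype=float).copy()
    zero_mask = scores == 0

    # Swap zeros for +inf so they cannot win the column-wise minimum.
    col_min = np.where(zero_mask, np.inf, scores).min(axis=0)

    # A column made up entirely of zeros has no non-zero minimum; leave it at 0.
    col_min[np.isinf(col_min)] = 0

    # Same fancy-indexing pattern as in the question: J selects each zero's column.
    I, J = np.nonzero(zero_mask)
    scores[I, J] = col_min[J]
    return scores


# Hypothetical sample data with a few zeros inserted for illustration.
example = np.array([
    [226.0, 600.0, 3.33],
    [0.0, 700.0, 0.0],
    [108.0, 0.0, 2.31],
])
result = impute_zeros_with_column_min(example)
```

Here the zero in the first column would be filled with 108, the one in the second column with 600, and the one in the third column with 2.31. If a higher value is the "worse" one for some columns, the same idea should work with `-np.inf` and `max` for those columns.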