I have the following table, where Hour is the index:

Hour         date        plant1     plant2    plant3 ....
07:00:00    2019-06-23    22.1      22.8      21.4
07:03:00    2019-06-23    31.7      33.1      12.4
07:06:00    2019-06-23    11.1      12.5      11.4
07:09:00    2019-06-23    17.6      19.34     22.1
...
08:26:00    2019-06-23    11.1      12.5      11.4
08:40:00    2019-06-23    17.6      19.34     22.1
08:50:00    2019-06-23    11.1      12.5      11.4
08:59:00    2019-06-23    17.6      19.34     22.1
09:06:00    2019-06-23    11.1      12.5      11.4
09:09:00    2019-06-23    17.6      19.34     22.1

I want to change the values of plant1 to whitespace or null only between the hours 07:10 and 08:51, so that it looks like this:

Hour         date        plant1     plant2    plant3 ....
07:00:00    2019-06-23    22.1      22.8      21.4
07:03:00    2019-06-23    31.7      33.1      12.4
07:06:00    2019-06-23    11.1      12.5      11.4
07:09:00    2019-06-23    17.6      19.34     22.1
...
08:26:00    2019-06-23              12.5      11.4
08:40:00    2019-06-23              19.34     22.1
08:50:00    2019-06-23              12.5      11.4
08:59:00    2019-06-23    17.6      19.34     22.1
09:06:00    2019-06-23    11.1      12.5      11.4
09:09:00    2019-06-23    17.6      19.34     22.1

I have tried to do it like this:

df.loc['plant1'] = df.loc['plant1'].mask((df['Hour'].between(time(7,10,0),time(8,51,0)),''))

But I'm getting a KeyError for plant1 (and I'm also not sure that this is the best or correct way to do it).

My end goal: to be able to remove values in a specific column for specific rows, selected by their index values.

Clarification: I need the hour alone to be the index (without the date).

1 Answer

If possible, convert the hour and date columns to a DatetimeIndex, use DatetimeIndex.indexer_between_time to get the positions of the rows between two times given as strings, and then set the column values with DataFrame.loc by indexing the DatetimeIndex:

import numpy as np
import pandas as pd

df.index = pd.to_datetime(df['date']) + pd.to_timedelta(df.index.astype(str))

idx = df.index.indexer_between_time('07:10:00','08:51:00')

df.loc[df.index[idx], 'plant1'] = np.nan
print (df)
                           date  plant1  plant2  plant3  ....
2019-06-23 07:00:00  2019-06-23    22.1   22.80    21.4   NaN
2019-06-23 07:03:00  2019-06-23    31.7   33.10    12.4   NaN
2019-06-23 07:06:00  2019-06-23    11.1   12.50    11.4   NaN
2019-06-23 07:09:00  2019-06-23    17.6   19.34    22.1   NaN
2019-06-23 08:26:00  2019-06-23     NaN   12.50    11.4   NaN
2019-06-23 08:40:00  2019-06-23     NaN   19.34    22.1   NaN
2019-06-23 08:50:00  2019-06-23     NaN   12.50    11.4   NaN
2019-06-23 08:59:00  2019-06-23    17.6   19.34    22.1   NaN
2019-06-23 09:06:00  2019-06-23    11.1   12.50    11.4   NaN
2019-06-23 09:09:00  2019-06-23    17.6   19.34    22.1   NaN
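
Since the clarification asks for Hour to stay as the index, here is a minimal sketch of a variation on the approach above: the DatetimeIndex is built as a temporary object and used only to find the row positions, so df.index is never replaced. The sample frame and the names stamps and pos are assumptions made for illustration, with Hour stored as strings.

```python
import numpy as np
import pandas as pd

# Hypothetical sample mirroring part of the question's table; Hour strings are the index.
df = pd.DataFrame(
    {
        "date": ["2019-06-23"] * 6,
        "plant1": [22.1, 17.6, 11.1, 11.1, 17.6, 11.1],
        "plant2": [22.8, 19.34, 12.5, 12.5, 19.34, 12.5],
    },
    index=pd.Index(
        ["07:00:00", "07:09:00", "08:26:00", "08:50:00", "08:59:00", "09:06:00"],
        name="Hour",
    ),
)

# Temporary DatetimeIndex built from date + hour; df.index itself is left untouched.
stamps = pd.to_datetime([f"{d} {h}" for d, h in zip(df["date"], df.index)])

# Integer positions of the rows whose time of day falls in the window (both ends inclusive).
pos = stamps.indexer_between_time("07:10:00", "08:51:00")

# Positional assignment, so the original Hour labels are never needed for the lookup.
df.iloc[pos, df.columns.get_loc("plant1")] = np.nan
print(df)
```

Because the assignment goes through iloc, this also works when the same hour appears on several dates.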

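In your solution, select the column with df['plant1'] rather than df.loc['plant1'], because .loc with a single label looks up a row, which is what raises the KeyError. If Hour is the index, also convert it with Index.to_series(), because between is a Series method and is not available on an Index: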

df['plant1'] = df['plant1'].mask((df.index.between(time(7,10,0),time(8,51,0))))

AttributeError: 'Index' object has no attribute 'between'

df['plant1'] = df['plant1'].mask((df.index.to_series().between(time(7,10,0),time(8,51,0))))
print (df)
                date  plant1  plant2  plant3  ....
07:00:00  2019-06-23    22.1   22.80    21.4   NaN
07:03:00  2019-06-23    31.7   33.10    12.4   NaN
07:06:00  2019-06-23    11.1   12.50    11.4   NaN
07:09:00  2019-06-23    17.6   19.34    22.1   NaN
08:26:00  2019-06-23     NaN   12.50    11.4   NaN
08:40:00  2019-06-23     NaN   19.34    22.1   NaN
08:50:00  2019-06-23     NaN   12.50    11.4   NaN
08:59:00  2019-06-23    17.6   19.34    22.1   NaN
09:06:00  2019-06-23    11.1   12.50    11.4   NaN
09:09:00  2019-06-23    17.6   19.34    22.1   NaN
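
As a self-contained sketch of the second approach, the snippet below assumes the Hour index holds datetime.time objects, as the question's use of time(7,10,0) suggests. The sample frame and the name in_window are hypothetical.

```python
from datetime import time

import pandas as pd

# Hypothetical sample; the index holds datetime.time objects (an assumption).
df = pd.DataFrame(
    {
        "plant1": [17.6, 11.1, 11.1, 17.6],
        "plant2": [19.34, 12.5, 12.5, 19.34],
    },
    index=pd.Index([time(7, 9), time(8, 26), time(8, 50), time(8, 59)], name="Hour"),
)

# Index has no .between, so convert it to a Series first; both bounds are inclusive.
in_window = df.index.to_series().between(time(7, 10), time(8, 51))

# mask() replaces values where the condition is True; the default replacement is NaN.
df["plant1"] = df["plant1"].mask(in_window.to_numpy())
print(df)
```

Passing the condition as a plain boolean array keeps the masking positional, which avoids label alignment if the index contains repeated hours.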