Fill Empty Pandas DataFrame Using Loop Method

Question

I am currently working with some telematics data where the trip ID is missing. Each trip ID is unique, and one trip covers multiple rows of data, e.g. GPS coordinates, temperature, voltage, RPM, timestamp, and engine status (on or off). The pattern in the data suggests that each run of rows from engine on through to the next engine off can be grouped under one trip ID. However, I am having difficulty translating this logic into code that generates the tripID values.

I tried a few pandas looping approaches, but they keep failing.

import pandas as pd
inp = [{'Ignition_Status':'ON', 'tripID':''},{'Ignition_Status':'ON','tripID':''},
       {'Ignition_Status':'ON', 'tripID':''},{'Ignition_Status':'OFF','tripID':''},
       {'Ignition_Status':'ON', 'tripID':''},{'Ignition_Status':'ON','tripID':''},
       {'Ignition_Status':'ON', 'tripID':''},{'Ignition_Status':'ON', 'tripID':''},
       {'Ignition_Status':'ON', 'tripID':''},{'Ignition_Status':'OFF', 'tripID':''},
       {'Ignition_Status':'ON', 'tripID':''},{'Ignition_Status':'OFF', 'tripID':''}]

test = pd.DataFrame(inp)
print (test)

Approach Taken

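import numpy as np

n = 1

for index, row in test.iterrows():
    # np.where(cond, n, n) yields n for every row, so each pass
    # overwrites the whole tripID column with the current n
    test['tripID'] = np.where(test['Ignition_Status'] == 'ON', n, n)
    n = n + 1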

Expected Result

[image: expected result]

anky · Accepted Answer · 2019-07-17 11:29:56Z

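Use Series.eq() to flag the OFF rows, Series.shift() to move each flag down one row so that an OFF row stays in the trip it ends, and Series.cumsum() to turn the flags into a running trip counter: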

test=test.assign(tripID=test.Ignition_Status.eq('OFF')
                    .shift(fill_value=False).cumsum().add(1))

   Ignition_Status  tripID
0               ON       1
1               ON       1
2               ON       1
3              OFF       1
4               ON       2
5               ON       2
6               ON       2
7               ON       2
8               ON       2
9              OFF       2
10              ON       3
11             OFF       3
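
For anyone who wants to see what each link in that chain does, here is a minimal sketch that splits it into named steps. The intermediate names (is_off, ended_before) are only for illustration and are not part of the answer above; the sample data repeats the ignition pattern from the question.

```python
import pandas as pd

# Same ignition pattern as the sample data in the question.
status = ['ON', 'ON', 'ON', 'OFF', 'ON', 'ON',
          'ON', 'ON', 'ON', 'OFF', 'ON', 'OFF']
test = pd.DataFrame({'Ignition_Status': status})

# Step 1: True on every row where the engine is switched off.
is_off = test['Ignition_Status'].eq('OFF')

# Step 2: move each flag down one row, so the OFF row itself still
# belongs to the trip it closes; the first row has no predecessor
# and is filled with False.
ended_before = is_off.shift(fill_value=False)

# Step 3: the running total counts how many trips have already ended;
# adding 1 makes the numbering start at 1 instead of 0.
test['tripID'] = ended_before.cumsum().add(1)

print(test)
```

Because the work is done on whole columns, no explicit row loop is needed.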

5 Comments

m36a Over a year ago

will try this code later in my working df and will let you know the outcome

adhg Over a year ago

@anky_91, I get: TypeError: shift() got an unexpected keyword argument 'fill_value'

anky Over a year ago

@adhg which version of pandas are you on (pd.__version__)? For older versions you can also use .shift().fillna(False)

anky Over a year ago

@adhg I meant pandas version, for lower versions use test=test.assign(tripID=test.Ignition_Status.eq('OFF').shift().fillna(False).cumsum().add(1))

m36a Over a year ago

thanks @anky_91, the code also works on my working df
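
Following up on the TypeError discussed in the comments: as far as I know, the fill_value argument of shift() only exists from pandas 0.24 onward. Below is a hedged sketch of a variant that avoids it by doing the arithmetic on integers, so the NaN introduced by shift() can be filled with 0 and cast back. The helper name assign_trip_ids is hypothetical, and this is a variation on the suggestion in the comments rather than the exact code posted there.

```python
import pandas as pd

def assign_trip_ids(status):
    """Hypothetical helper: number trips so each one ends at its OFF row."""
    # 1 where the engine is switched off, 0 otherwise.
    is_off = status.eq('OFF').astype(int)
    # shift() without fill_value leaves NaN in the first row (and makes
    # the column float), so fill it with 0 and cast back to int.
    ended_before = is_off.shift().fillna(0).astype(int)
    # Count the trips that have already ended, then start numbering at 1.
    return ended_before.cumsum() + 1

status = pd.Series(['ON', 'OFF', 'ON', 'ON', 'OFF', 'ON'])
print(assign_trip_ids(status).tolist())
```

Usage on the DataFrame from the question would be test['tripID'] = assign_trip_ids(test['Ignition_Status']).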
