1

This is a fragment of the df when visualized

I want to calculate the average number of successful Rattata catches per hour across this whole dataset. I am looking for an efficient way to do this with pandas; I'm new to both Python and pandas.

4
  • If your code already works, you'd be better off asking this question on Code Review. Commented Jan 5, 2017 at 10:54
  • It doesn't work :( Commented Jan 5, 2017 at 11:02
  • Can you upload your code (minimal reproducible example)? Commented Jan 5, 2017 at 12:19
  • Do you hack Pokemon Go? :) Commented Jan 5, 2017 at 13:12

2 Answers

1

You don't need any loops. Try this; I think the logic is fairly clear.

import pandas as pd

# Read the CSV
df = pd.read_csv('pkmn.csv', header=0)

# Parse the timestamps so the date can be extracted
df['time'] = df['time'].apply(lambda x: pd.to_datetime(str(x)))
df['date'] = df['time'].dt.date

# Main transformations: keep caught Rattata and group them by hour
grouped = df.query("Pokemon == 'rattata' and caught == True").groupby('hour')
result = pd.DataFrame()
result['caught total'] = grouped['hour'].count()    # catches per hour
result['days'] = grouped['date'].nunique()          # distinct days with a catch in that hour
result['caught average'] = result['caught total'] / result['days']
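
To try the same idea without the CSV, here is a minimal self-contained sketch. The sample rows are made up, and the column names (time, Pokemon, caught) are assumptions about the question's dataset; the hour column is derived from time here rather than read from the file.

```python
import pandas as pd

# Hypothetical sample data; column names are assumed, not taken from the real file
df = pd.DataFrame({
    "time": pd.to_datetime([
        "2017-01-01 09:05", "2017-01-01 09:40", "2017-01-01 10:15",
        "2017-01-02 09:20", "2017-01-02 10:30", "2017-01-02 10:45",
    ]),
    "Pokemon": ["rattata", "rattata", "pidgey", "rattata", "rattata", "rattata"],
    "caught": [True, True, True, True, True, False],
})
df["hour"] = df["time"].dt.hour
df["date"] = df["time"].dt.date

# Keep only successful Rattata catches
caught_rats = df[(df["Pokemon"] == "rattata") & df["caught"]]

# Per hour of day: number of catches and number of distinct days with a catch
result = caught_rats.groupby("hour").agg(
    caught_total=("caught", "count"),
    days=("date", "nunique"),
)
result["caught_average"] = result["caught_total"] / result["days"]
print(result)
```

Note that days only counts dates with at least one successful catch in that hour, so hours with no catches on a given day do not pull the average down. If those days should count as zero, dividing by the total number of distinct dates in the dataset may be closer to what you want.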

0

If you have your pandas DataFrame saved as df, something like this should work:

        rats = df.loc[df.Pokemon == "rattata"]  # Subset of rows relating to Rattata

        total = rats.Caught.sum()  # Total number caught

        # Time between the first and last rows (assumes time is a datetime column, sorted ascending)
        diff = rats.time.iloc[-1] - rats.time.iloc[0]

        average = total / (diff / pd.Timedelta(hours=1))  # Catches per hour (needs pandas imported as pd)
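
As a self-contained sketch of this overall-rate idea, the example below uses made-up rows and assumed column names (time, Pokemon, Caught). It picks the first and last rows by position with .iloc, since the filtered frame keeps its original index labels, and converts the elapsed Timedelta to hours before dividing.

```python
import pandas as pd

# Hypothetical sample data; column names are assumed, not taken from the real file
df = pd.DataFrame({
    "time": pd.to_datetime([
        "2017-01-01 09:00", "2017-01-01 10:30",
        "2017-01-01 12:00", "2017-01-01 13:00",
    ]),
    "Pokemon": ["rattata", "pidgey", "rattata", "rattata"],
    "Caught": [True, True, False, True],
})

# Rattata rows only, ordered by time so first/last are the earliest/latest
rats = df.loc[df["Pokemon"] == "rattata"].sort_values("time")

total = rats["Caught"].sum()  # number of successful catches

# Elapsed time between first and last Rattata rows, expressed in hours
elapsed_hours = (rats["time"].iloc[-1] - rats["time"].iloc[0]) / pd.Timedelta(hours=1)

average = total / elapsed_hours  # successful catches per hour
print(average)
```

This gives a single rate over the whole time span, not a breakdown by hour of day.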
