Consider the avocado dataset. How can I find the mean of 'Total Volume' for the 'Boston' region from 29 Dec 2015 to 13 April 2018?

I am trying to calculate it with df.loc:

import pandas as pd
data_new = data.loc[(data['Date']>='2015-12-29')&(data['Date']<='2018-04-13')&(data['region']=='Boston')]
print(data_new.mean())

Is this right, or do I have to provide an axis to mean()?

Dataset: https://www.kaggle.com/neuromusic/avocado-prices

  • Consider the avocado dataset. What avocado dataset? Is it right or I have to provide the axis for the mean()? That's pretty easy to figure out, does it produce the correct result? Commented Jan 12, 2020 at 21:57
  • This question is from HackerRank. The problem statement was to calculate the mean of 'Total Volume' from 29 Dec 2015 to 13 April 2018 for the 'Boston' region. The above solution got a "wrong answer" verdict. Commented Jan 12, 2020 at 22:00
  • That's information which should be in your post, then. Commented Jan 12, 2020 at 22:01
  • @RishiSahu you're pretty close my man, just add the column name to the end and add .mean() Commented Jan 12, 2020 at 22:37

1 Answer

If you do not select a column, mean() is computed for every column of the filtered DataFrame rather than for 'Total Volume' alone. As a side note, when you have many conditions, it is more readable to pull them out into a separate variable:

condition = (
    (data['Date'] >= '2015-12-29')
    & (data['Date'] <= '2018-04-13')
    & (data['region'] == 'Boston')
)
mean_total_vol = data.loc[condition, 'Total Volume'].mean()
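
To try this without downloading the dataset, here is a minimal self-contained sketch. The rows below are made-up sample values, not real avocado data; only the column names ('Date', 'region', 'Total Volume') are taken from the question. It also assumes 'Date' is read in as strings, so it converts the column with pd.to_datetime first so that the range check is chronological rather than a string comparison.

```python
import pandas as pd

# Hypothetical sample rows standing in for the Kaggle CSV (values are invented).
data = pd.DataFrame({
    "Date": ["2015-12-27", "2016-01-03", "2017-06-11", "2018-04-15", "2016-01-03"],
    "region": ["Boston", "Boston", "Boston", "Boston", "Albany"],
    "Total Volume": [100.0, 200.0, 400.0, 800.0, 1600.0],
})

# Parse the strings into timestamps so comparisons are done on real dates.
data["Date"] = pd.to_datetime(data["Date"])

start = pd.Timestamp("2015-12-29")
end = pd.Timestamp("2018-04-13")

# Keep Boston rows whose date falls inside the range (both ends inclusive).
condition = data["Date"].between(start, end) & (data["region"] == "Boston")

# Select the single column before calling mean(), so the result is one number.
mean_total_vol = data.loc[condition, "Total Volume"].mean()
print(mean_total_vol)
```

With these sample rows, only the two Boston rows inside the range (200.0 and 400.0) are averaged. Selecting the column inside .loc also avoids the chained indexing of data[condition]['Total Volume'].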