I am reading OHLCV data for NVIDIA downloaded from Yahoo Finance and creating a Signal column with the values 'Buy' / 'Dont Buy'. When I try to flag the rows where the volume is above its 7-row rolling average, every row comes out the same: either all 'Buy' or all 'Dont Buy'.

import pandas as pd

df = pd.read_csv('NVDA.csv', dtype={'label': str})
df['Price%delta'] = (df['Close'] / df['Open']) * 100

df['Avg_volume'] = df['Volume'].rolling(7).mean()

df['Signal'] = 0

for index, row in df.iterrows():
    if row['Volume'] > row['Avg_volume']:
        df['Signal'] = 'Buy'
    else:
        df['Signal'] = 'Dont Buy'
  • What's your question? Commented Nov 30, 2018 at 20:21
  • Use something like df['Signal'] = np.where(df['Volume'] > df['Avg_volume'], 'Buy', 'Dont Buy'). Avoid for loops for this when it can be done in a vectorized way. Commented Nov 30, 2018 at 20:23
  • @IanQuah Notice how in each iteration the OP is setting the whole series rather than a single row. Hence every value ends up identical, either all Buy or all Dont Buy, depending on the last iteration. Commented Nov 30, 2018 at 20:24
  • thanks i appreciate your advice Commented Nov 30, 2018 at 20:29
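
The behaviour described in the comments above can be reproduced on a small frame. This is only a sketch: the toy volumes and the 2-row window are made up for illustration and are not taken from NVDA.csv.

```python
import pandas as pd

# Hypothetical toy data standing in for the real CSV
df = pd.DataFrame({"Volume": [10, 50, 20, 5]})
df["Avg_volume"] = df["Volume"].rolling(2).mean()  # NaN, 30.0, 35.0, 12.5

df["Signal"] = 0
for index, row in df.iterrows():
    if row["Volume"] > row["Avg_volume"]:
        df["Signal"] = "Buy"       # overwrites the whole column
    else:
        df["Signal"] = "Dont Buy"  # overwrites the whole column

# Only the last row's comparison (5 > 12.5 is False) survives
signals = df["Signal"].tolist()
print(signals)
```

Row 1 does satisfy the condition (50 > 30), but that result is overwritten by the later iterations, so the column ends up uniform.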

3 Answers

You don't really need the for loop at all:

mask = df["Volume"] > df["Avg_volume"] 

df.loc[mask, "Signal"] = "Buy"
df.loc[~mask, "Signal"] = "Don't buy"
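
For a self-contained check of the mask approach, here is a minimal sketch on made-up volumes with a 3-row window (both are assumptions for illustration). It also shows one detail worth knowing: the first rows of a rolling mean are NaN, and any comparison against NaN is False, so those rows fall on the "Don't buy" side.

```python
import pandas as pd

# Hypothetical toy data; the real question uses NVDA.csv and rolling(7)
df = pd.DataFrame({"Volume": [100, 120, 90, 200, 80]})
df["Avg_volume"] = df["Volume"].rolling(3).mean()  # first two rows are NaN

# Boolean mask: True where volume exceeds its rolling average
mask = df["Volume"] > df["Avg_volume"]

df["Signal"] = "Don't buy"      # default for every row
df.loc[mask, "Signal"] = "Buy"  # overwrite only the rows where mask is True

result = df["Signal"].tolist()
print(df)
```

If the warm-up rows should not receive a signal at all, one option would be to drop them with df.dropna(subset=["Avg_volume"]) before labelling.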

You are not specifying which row to assign 'Buy' or 'Dont Buy' to, so df['Signal'] = ... overwrites the entire column on every iteration. Use loc with the row index instead:

for index, row in df.iterrows(): 
    if row['Volume'] > row['Avg_volume']:
        df.loc[index, 'Signal']='Buy'
    else:
        df.loc[index, 'Signal']='Dont Buy'

6 Comments

it finally worked thank you nixon
Good to know. If it solved your problem, please let me know by marking the answer as accepted.
Just an FYI - this solution will be considerably slower than the vectorized solution I have posted for larger dataframes.
I'm aware it can be vectorized, @rahlf23. However, given that the user is unaware of how to index data properly, I found it more appropriate to help correct the actual code.
Not saying your answer is incorrect @nixon, it definitely produces the desired output for the OP. Just pointing something out :)

A vectorized solution using np.where():

df['Signal'] = np.where(df['Volume'] > df['Avg_volume'], 'Buy', 'Dont Buy')
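
As a sanity check, the np.where() one-liner can be compared against a row-by-row loop. The sketch below uses made-up volumes and a 3-row window (assumptions for illustration only) and confirms that both produce the same labels.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; not taken from NVDA.csv
df = pd.DataFrame({"Volume": [100, 120, 90, 200, 80, 300]})
df["Avg_volume"] = df["Volume"].rolling(3).mean()

# Vectorized: one comparison over the whole column
vectorized = np.where(df["Volume"] > df["Avg_volume"], "Buy", "Dont Buy")

# Row-by-row equivalent, collecting one label per row
looped = []
for index, row in df.iterrows():
    looped.append("Buy" if row["Volume"] > row["Avg_volume"] else "Dont Buy")

same = list(vectorized) == looped
df["Signal"] = vectorized
print(df)
```

np.where() returns a NumPy array with one element per row, which is why it can be assigned directly to the column.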
