I have a pandas DataFrame where checkdataframe.shape is (68125, 109). I want to apply an operation to every column, the same way I do below for a single list.

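import numpy as np

def alter_column(column, batchSize=10):
    # Keep the first value of every batch of batchSize items; replace the rest with NaN.
    return_list = []
    for idx, value in enumerate(column):
        if idx % batchSize == 0:
            return_list.append(value)
        else:
            return_list.append(np.nan)
    return return_list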

This returns a list in which only the first value of every batch of 10 is kept and the rest are replaced with NaN, like this:

['175,5200',nan,nan,nan,nan,nan,nan,nan,nan,nan,'175,5200',nan,nan,nan,nan,nan,nan,nan,nan,nan,'180,0000']

I want to apply this to the entire dataframe. I tried df.iteritems and df.iterrows, but both raise errors. Is there a way to do this?

Example:
df['column1']=[1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2]
df['column2']=[3,3,3,3,3,3,3,3,3,3,4,4,4,4,4,4,4,4,4,4,4]
Expected output:
column1=['1',nan,nan,nan,nan,nan,nan,nan,nan,nan,'2',nan,nan,nan,nan,nan,nan,nan,nan,nan]
column2=['3',nan,nan,nan,nan,nan,nan,nan,nan,nan,'4',nan,nan,nan,nan,nan,nan,nan,nan,nan]

My real dataset has 109 columns, though.

  • Please provide a small data sample, as well as the required output. See stackoverflow.com/questions/20109391/… Commented Jun 25, 2020 at 12:49
  • @Roy2012 I added a sample data and output. Commented Jun 25, 2020 at 13:04

1 Answer

If your dataframe has the default integer index (0 .. n), you can apply this:

df[~df.index.isin(np.arange(0, df.shape[0], batchSize))] = np.nan

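This keeps every batchSize-th row (every 10th row for batchSize = 10, starting from row 0) and sets all other rows to np.nan, across all columns at once.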
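For reference, here is a minimal, self-contained sketch of the same idea that does not depend on the index labels. It is a variation on the one-liner above, not the exact code from the answer: it builds a positional mask and uses DataFrame.where instead of in-place assignment. The names batch_size, keep, keep_2d and result are made up for illustration, and the small frame is a stand-in for the real (68125, 109) data, built from the sample in the question.

```python
import numpy as np
import pandas as pd

batch_size = 10  # hypothetical name: keep one row out of every 10

# Small stand-in for the real dataframe, using the question's sample data.
df = pd.DataFrame({
    "column1": [1] * 10 + [2] * 11,
    "column2": [3] * 10 + [4] * 11,
})

# Positional mask: True for rows 0, 10, 20, ... regardless of index labels.
keep = np.arange(len(df)) % batch_size == 0

# DataFrame.where needs an array condition with the same shape as the frame,
# so repeat the row mask across all columns.
keep_2d = np.repeat(keep[:, None], df.shape[1], axis=1)

# Values are kept where the condition is True and replaced with NaN elsewhere.
# Integer columns are upcast to float so they can hold NaN.
result = df.where(keep_2d)

print(result.dropna())
```

Because the mask is computed once for the rows and applied to the whole frame, the same code should work unchanged for 109 columns, with no need to loop over columns or rows.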
