
I have a pandas data frame gmat. The sample data looks like

YEAR  student score mail_id      phone            Loc
2012  abc     630   [email protected]  1-800-000-000   pqr
2012  pqr     630   [email protected]  1-800-000-000   abc

I would like to iterate through this data frame and, in each iteration of a for loop, create a one-row dataframe from the current row to use in a calculation. Each iteration overwrites the previous dataframe with the current row. For example, my first dataframe in the loop will look like

YEAR  student score mail_id      phone            Loc
2012  abc     630   [email protected]  1-800-000-000   pqr

and the second dataframe, after overwriting the first one, will look like

YEAR  student score mail_id      phone            Loc
2012  pqr     630   [email protected]  1-800-000-000   abc

So I tried the following code:

for row in gmat.iterrows():
    
    df=pd.DataFrame(list(row))

But when I check, df is not populated properly: it only shows 2 columns. Can you please suggest how to do this?

I also tried this based on Georgy's suggestion: I used for index, row in gmat.iterrows(). Here row is a pd.Series, which I then convert with gmrow=pd.DataFrame(row). But the column headings of my original data come out as row labels. The data I'm getting looks like

YEAR     2012
student  abc
score    630
mail_id  [email protected]
phone    1-800-000-000
Loc      pqr

  • Possible duplicate of How to iterate over rows in a DataFrame in Pandas? Commented Feb 1, 2018 at 15:22
  • See the accepted answer above. It should be for index, row in gmat.iterrows(). In your case your row is a tuple of an integer index and a pd.Series. This is why you get those '2 columns'. Also, when you fix this, you won't need to convert row to list. Commented Feb 1, 2018 at 15:26
  • @Georgy, please refer to my original post. I tried your suggestion, but the output format I'm getting is different from what I want. Commented Feb 2, 2018 at 6:29
  • gmrow=pd.DataFrame(row).T will transpose it to the format you want Commented Feb 2, 2018 at 10:17
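
To illustrate the transpose suggested in the last comment, here is a minimal sketch. The sample frame is an assumption built from the rows in the question, with the mail_id and phone columns left out because their values are redacted or not needed for the point. It shows that pd.DataFrame(row) puts the original column names on the index, and .T turns that back into a single row.

```python
import pandas as pd

# Hypothetical sample data modelled on the question (mail_id and phone omitted).
gmat = pd.DataFrame({
    "YEAR": [2012, 2012],
    "student": ["abc", "pqr"],
    "score": [630, 630],
    "Loc": ["pqr", "abc"],
})

for index, row in gmat.iterrows():
    # row is a Series; wrapping it gives one column, with the
    # original column names as the row labels: shape (4, 1).
    as_column = pd.DataFrame(row)

    # Transposing restores the original layout: shape (1, 4).
    as_row = pd.DataFrame(row).T

    print(as_row)
```

Note that iterrows does not preserve dtypes across a mixed-type row, so the columns of as_row typically end up with the generic object dtype rather than the dtypes of the original frame.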

1 Answer

You can slice your dataframe like this:

for index, row in gmat.iterrows():
    x = gmat[index:index+1]
    print("print iterations:", x)

print is just an example. You can do your desired transformations with x.
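
As a variation on the slicing idea, the sketch below selects each row by position with iloc and a one-element list, which also returns a one-row DataFrame. This is not the answer's original code; the sample frame and the name single_rows are assumptions for illustration, and mail_id and phone are left out.

```python
import pandas as pd

# Hypothetical sample data modelled on the question (mail_id and phone omitted).
gmat = pd.DataFrame({
    "YEAR": [2012, 2012],
    "student": ["abc", "pqr"],
    "score": [630, 630],
    "Loc": ["pqr", "abc"],
})

single_rows = []  # collected only so the result can be inspected afterwards
for i in range(len(gmat)):
    # A list selector ([i]) keeps the result a DataFrame;
    # a bare integer (gmat.iloc[i]) would return a Series instead.
    one_row = gmat.iloc[[i]]
    single_rows.append(one_row)
    print(one_row)
```

Because iloc is purely positional, this works whatever the index labels are, and each one-row frame keeps the column dtypes of gmat.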


4 Comments

There is no need to use iterrows if you never use the row. A simple for index in range(...) would be enough.
Could you please elaborate on the use of range to iterate over a dataframe? As far as I know, iterrows returns the dataframe's rows with its original schema. The range function takes an int and will possibly throw an "int object is not iterable" error when used on a dataframe.
I don't really understand what is not clear. Yes, iterrows yields rows of a dataframe along with corresponding indices, but use of it is justified if you actually use those rows. In your case you are using only the indices. Either use row or don't use iterrows at all. So, for example, either for _, row in df.iterrows(): print(pd.DataFrame(row).T) or for i in range(df.shape[0]): print(df[i:i+1])
Sure! Thanks for the explanation, sir!
