
I am trying to load a CSV file (around 250 MB) as a DataFrame with pandas. On my first try I used the usual read_csv command, but I received a MemoryError. I then tried the chunked approach mentioned in Large, persistent DataFrame in pandas:

import pandas as pd

x = pd.read_csv('myfile.csv', iterator=True, chunksize=1000)
xx = pd.concat([chunk for chunk in x], ignore_index=True)

but when I try to concatenate I receive the following error: Exception: "All objects passed were None". In fact, I cannot access the chunks at all.

I am using WinPython 3.3.2.1 (32-bit) with pandas 0.11.0.

  • Did you resolve this issue? Did you upgrade to pandas 0.12.0? Commented Oct 4, 2013 at 5:42
  • Yes, I installed the latest WinPython 64-bit version and it worked with my files. I still have to test it with bigger files. Commented Oct 9, 2013 at 9:30

2 Answers


I suggest that you install the 64-bit version of WinPython. Then you should be able to load a 250 MB file without problems.
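If you are not sure which interpreter you are currently running, a quick check like the following (standard library only, shown here as a minimal sketch) reports whether your Python is 32-bit or 64-bit:

import struct

# Pointer size in bits: 32 on a 32-bit interpreter, 64 on a 64-bit one
print(struct.calcsize("P") * 8)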




I'm late, but the actual problem with the posted code is that pd.concat([chunk for chunk in x]) effectively cancels any benefit of chunking, because it concatenates all those chunks back into one big DataFrame. That probably even requires roughly twice the memory temporarily.
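As a rough sketch of how to keep the memory benefit, you can aggregate each chunk as it arrives instead of concatenating them. The column name 'value' below is purely illustrative, not from the original question:

import pandas as pd

total = 0
rows = 0
# Only one chunk lives in memory at a time; each is discarded after processing
for chunk in pd.read_csv('myfile.csv', chunksize=100000):
    total += chunk['value'].sum()  # 'value' is a hypothetical numeric column
    rows += len(chunk)

print(total / rows)  # mean of the column without loading the whole file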

