
I am trying to load a CSV file (around 250 MB) as a DataFrame with pandas. On my first try I used the usual read_csv command, but I received a MemoryError. I then tried the chunked approach mentioned in Large, persistent DataFrame in pandas:

import pandas as pd

x = pd.read_csv('myfile.csv', iterator=True, chunksize=1000)
xx = pd.concat([chunk for chunk in x], ignore_index=True)

but when I try to concatenate I receive the following error: Exception: "All objects passed were None". In fact, I cannot access the chunks at all.

I am using WinPython 3.3.2.1 (32-bit) with pandas 0.11.0.

  • Did you resolve this issue? Did you upgrade to pandas 0.12.0? Commented Oct 4, 2013 at 5:42
  • Yes, I installed the latest WinPython 64-bit version and it worked with my files. I still have to test it with bigger files. Commented Oct 9, 2013 at 9:30

2 Answers


I suggest that you install the 64-bit version of WinPython. Then you should be able to load a 250 MB file without problems.
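If you are not sure which interpreter you are currently running, a quick check like the following (standard library only, shown here as a minimal sketch) reports whether your Python is 32-bit or 64-bit:

import struct

# Pointer size in bits: 32 on a 32-bit interpreter, 64 on a 64-bit one
print(struct.calcsize("P") * 8)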




I'm late, but the actual problem with the posted code is that pd.concat([chunk for chunk in x]) effectively cancels any benefit of chunking, because it concatenates all those chunks back into one big DataFrame. That probably even requires roughly twice the memory temporarily.
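As a rough sketch of how to keep the memory benefit, you can aggregate each chunk as it arrives instead of concatenating them. The column name 'value' below is purely illustrative, not from the original question:

import pandas as pd

total = 0
rows = 0
# Only one chunk lives in memory at a time; each is discarded after processing
for chunk in pd.read_csv('myfile.csv', chunksize=100000):
    total += chunk['value'].sum()  # 'value' is a hypothetical numeric column
    rows += len(chunk)

print(total / rows)  # mean of the column without loading the whole file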

