@jreback
Contributor

jreback commented Apr 26, 2016

In [4]: DataFrame(np.random.randn(1000000,1)).to_csv('test.csv',index=False)

branch

In [1]: %memit pd.read_csv('test.csv',skiprows=999999)
peak memory: 65.74 MiB, increment: 1.59 MiB

In [2]: %memit pd.read_csv('test.csv',skiprows=999999)
peak memory: 65.89 MiB, increment: 0.22 MiB

In [3]: %memit pd.read_csv('test.csv',skiprows=999999)
peak memory: 65.98 MiB, increment: 0.28 MiB

master

In [1]: %memit pd.read_csv('test.csv',skiprows=999999)
peak memory: 169.84 MiB, increment: 105.79 MiB

In [2]: %memit pd.read_csv('test.csv',skiprows=999999)
peak memory: 171.27 MiB, increment: 24.11 MiB

In [3]: %memit pd.read_csv('test.csv',skiprows=999999)
peak memory: 173.39 MiB, increment: 24.63 MiB
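
For anyone who wants to repeat the comparison without IPython and `%memit`, here is a minimal sketch using only the standard library for the measurement. It is not the method used above: the helper names (`write_test_csv`, `peak_rss`, `peak_increase`, `run_benchmark`) are made up for illustration, and it reads the process-wide peak RSS through `resource.getrusage`, which is Unix only and reports KiB on Linux but bytes on macOS. Because that figure is a high-water mark, each build should be measured in a fresh interpreter.

```python
import csv
import os
import random
import resource  # Unix only
import tempfile


def write_test_csv(path, n_rows):
    """Write a header line plus n_rows random floats, one row at a time."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f, lineterminator="\n")
        writer.writerow(["0"])  # same header to_csv(index=False) gives an unnamed column
        for _ in range(n_rows):
            writer.writerow([random.gauss(0.0, 1.0)])


def peak_rss():
    """Peak resident set size of this process (KiB on Linux, bytes on macOS)."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


def peak_increase(func, *args, **kwargs):
    """Call func and return (result, growth of the process-wide peak RSS)."""
    before = peak_rss()
    result = func(*args, **kwargs)
    return result, peak_rss() - before


def run_benchmark(n_rows=1_000_000):
    """Mirror the session above: skip every line except the last two."""
    import pandas as pd  # imported lazily so the helpers above work without pandas

    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "test.csv")
        write_test_csv(path, n_rows)
        df, grew = peak_increase(pd.read_csv, path, skiprows=n_rows - 1)
    return df.shape, grew
```

Running `print(run_benchmark())` once per build, each in its own process, gives a rough equivalent of the first `%memit` line in each session. The absolute numbers will not match `%memit`, which samples current rather than peak memory, so only the difference between the two builds is meaningful.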
jreback added the Performance (Memory or execution speed performance) and IO CSV (read_csv, to_csv) labels Apr 26, 2016
jreback added this to the 0.18.1 milestone Apr 26, 2016
@jreback
Contributor Author

jreback commented Apr 26, 2016

@gfyoung I believe this is handled internally in the c-parser.

@gfyoung
Member

gfyoung commented Apr 27, 2016

@jreback: Travis and I both agree. LGTM otherwise.

jreback closed this in b8921ac Apr 27, 2016