I have a number of large (~100 MB) files which I'm regularly processing. Even though I try to delete unneeded data structures during processing, memory consumption is too high, so I was wondering if there is a way to manipulate large data efficiently, e.g.:

def read(self, filename):
    fc = read_100_mb_file(filename)
    self.process(fc)

def process(self, content):
    # do some processing of the file content
    pass

Is there a duplication of data structures here? Isn't it more memory efficient to store the content in an attribute like self.fc instead of passing it around?
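To make the question concrete, here is a minimal check I could run (with read_100_mb_file stubbed out for illustration), comparing object identities to see whether passing the content into process() creates a second copy:

def read_100_mb_file(filename):
    # stand-in for the real loader: read the whole file as one string
    with open(filename) as f:
        return f.read()

class Processor:
    def read(self, filename):
        fc = read_100_mb_file(filename)
        print(id(fc))           # identity of the loaded content
        self.process(fc)

    def process(self, content):
        print(id(content))      # prints the same id: the parameter is a
                                # reference to the same object, not a copy

Processor().read('data.txt')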

When should I use garbage collection? I know about the gc module, but should I call gc.collect() after I del fc, for example? Is the garbage collector invoked after a del statement at all?
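For example, this is roughly the pattern I have in mind (do_processing stands in for the real work):

import gc

def process_all(filenames):
    for filename in filenames:
        fc = read_100_mb_file(filename)  # loader as sketched above
        do_processing(fc)                # placeholder for the real work
        del fc        # drop the reference to the big string
        gc.collect()  # is an explicit collection ever needed here?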

Update

P.S. 100 MB is not a problem in itself, but the float conversion and further processing add significantly more to both the working set and the virtual size (I'm on Windows).
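To give a rough idea of the blow-up (a sketch; exact sizes are CPython-specific):

import sys

line = '0.1 0.2 0.3 0.4 0.5'
floats = [float(x) for x in line.split()]
text_size = sys.getsizeof(line)
list_size = sys.getsizeof(floats) + sum(sys.getsizeof(f) for f in floats)
# every float becomes a full Python object (about 24 bytes in CPython),
# so the converted data ends up several times larger than the raw text
print(text_size, list_size)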
