The following simple Python code:

class Node:
    NumberOfNodes = 0
    def __init__(self):
        Node.NumberOfNodes += 1
if __name__ == '__main__':
    nodes = []
    for i in xrange(1, 7 * 1000 * 1000):
        if i % 1000 == 0:
            print i
        nodes.append(Node())

takes gigabytes of memory, which seems unreasonable to me. Is that normal in Python?

How could I fix that? (In my original code, I have about 7 million objects, each with 10 fields, and that takes 8 gigabytes of RAM.)

  • Can we see the original code? Commented Jan 7, 2015 at 23:43
  • So each object takes approximately 1 KB of memory. What kind of fields are they? Commented Jan 7, 2015 at 23:50
  • I just simplified it. There are 10 other integer fields, all of which I initialize to zero; I don't think that matters much. The same code in C should take about 4 * 7,000,000 bytes (< 30 MB). Commented Jan 7, 2015 at 23:50

2 Answers

If you have a fixed number of fields, you can use __slots__ to save quite a lot of memory. Note that __slots__ has some limitations, so make sure you read the "Notes on using __slots__" section of the Python documentation carefully before choosing to use it in your application:

>>> import sys
>>> class Node(object):
...     NumberOfNodes = 0
...     def __init__(self):
...         Node.NumberOfNodes += 1
...
>>> n = Node()
>>> sys.getsizeof(n)
64
>>> class Node(object):
...     __slots__ = ()
...     NumberOfNodes = 0
...     def __init__(self):
...         Node.NumberOfNodes += 1
...
>>> n = Node()
>>> sys.getsizeof(n)
16
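
The classes above have no instance fields, so the numbers understate the difference for real objects. A minimal sketch of a closer comparison follows, written for Python 3. The class names, the field names a, b and c, and the helper instance_size are made up for illustration. The helper adds the size of the instance's __dict__ when one exists, since sys.getsizeof alone does not count it. Exact byte counts depend on the interpreter version and platform, so none are shown here.

```python
import sys


class PlainNode(object):
    """Regular class: every instance carries its own __dict__."""

    def __init__(self):
        self.a = 0
        self.b = 0
        self.c = 0


class SlotNode(object):
    """Same fields, stored in fixed slots instead of a per-instance dict."""

    __slots__ = ('a', 'b', 'c')  # hypothetical field names

    def __init__(self):
        self.a = 0
        self.b = 0
        self.c = 0


def instance_size(obj):
    """Shallow size of obj plus the size of its __dict__, if it has one."""
    size = sys.getsizeof(obj)
    if hasattr(obj, '__dict__'):
        size += sys.getsizeof(obj.__dict__)
    return size


plain = PlainNode()
slotted = SlotNode()
print("plain:  ", instance_size(plain), "bytes")
print("slotted:", instance_size(slotted), "bytes")
```

One of the limitations mentioned above is visible here: assigning an attribute that is not listed in __slots__ (for example slotted.d = 1) raises AttributeError.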

7 Comments

Thanks very much. I tried that, and memory usage dropped to about 1.5 GB (originally 8 GB). But as I said in the comments, the equivalent C program should take about 300 MB, which means Python uses 5 times more memory.
The size of the normal class instance should really include the size of its __dict__ as well, e.g. sys.getsizeof(n) + sys.getsizeof(vars(n)). getsizeof doesn't count the size of the dict because it's a separate object.
@Farzam Well, in Python everything is an object, including classes, integers, etc. That's not the case in C/C++.
@AshwiniChaudhary I know, but even if you use a class in C++, the difference is not nearly as noticeable as it is with Python. So what does Python store in this extra memory?
@Dunes Oh, yes! And the size of __slots__ in the case of the second one.

Python is an inherently memory-heavy programming language. There are some ways to get around this. __slots__ is one. Another, more extreme approach is to use numpy to store your data. You can use numpy to create a structured array or record -- a compound data type that uses minimal memory but loses a substantial amount of functionality compared to a normal Python class. That is, you are working with the numpy array class rather than your own class, so you cannot define your own methods on your array.

import numpy as np

# data type for a record with three 32-bit ints called x, y and z
dtype = [(name, np.int32) for name in 'xyz']
arr = np.zeros(1000, dtype=dtype)
# access member of x of a record
arr[0]['x'] = 1 # name based access
# or
assert arr[0][0] == 1 # index based access
# accessing all x members of records in array
assert arr['x'].sum() == 1
# size of array used to store elements in memory
assert arr.nbytes == 12000 # 1000 elements * 3 members * 4 bytes per int

See the numpy documentation on structured arrays for more.
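
If losing your own methods is the main objection, a similar compact layout can be wrapped in an ordinary class. The sketch below is a different technique from the structured array above: it uses the standard-library array module instead of numpy, with one array of C ints per field. The class name NodeStore, its methods, and the field names are all hypothetical. The size of a C int is platform-dependent (commonly 4 bytes), so the code reads itemsize rather than assuming a value.

```python
from array import array


class NodeStore(object):
    """Column-oriented storage: one compact array of C ints per field."""

    FIELDS = ('x', 'y', 'z')  # hypothetical field names

    def __init__(self, count):
        # array('i', [0]) * count builds `count` zero-initialized C ints
        self.columns = {name: array('i', [0]) * count for name in self.FIELDS}
        self.count = count

    def get(self, index, field):
        """Read one field of one record."""
        return self.columns[field][index]

    def set(self, index, field, value):
        """Write one field of one record."""
        self.columns[field][index] = value

    def total(self, field):
        """Sum of one field over all records."""
        return sum(self.columns[field])

    def nbytes(self):
        """Bytes used by the raw field data (ignores per-array overhead)."""
        return sum(len(col) * col.itemsize for col in self.columns.values())


store = NodeStore(1000)
store.set(0, 'x', 1)
print(store.get(0, 'x'), store.total('x'), store.nbytes())
```

The trade-off is that a record is no longer an object of its own: it is just an index into the store, and values are limited to what fits in a C int.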
