777

I want to know the memory usage of my Python application and specifically want to know what code blocks/portions or objects are consuming most memory. Google search shows a commercial one is Python Memory Validator (Windows only).

And open source ones are PySizer and Heapy.

I haven't tried any of them, so I wanted to know which one is the best, considering:

  1. It gives the most detail.

  2. It requires the fewest (or no) changes to my code.

6
  • 4
    For finding the sources of leaks I recommend objgraph. Commented Nov 15, 2012 at 10:23
  • 14
    @MikeiLL There is a place for questions like these: Software Recommendations Commented Feb 5, 2015 at 19:12
  • 7
    This is happening often enough that we should be able to migrate one question to another forum instead. Commented Apr 11, 2016 at 14:53
  • 7
    I recommend pympler Commented Jun 20, 2017 at 13:57
  • 3
    Check out memray Commented Apr 27, 2022 at 12:36

8 Answers

514

My module memory_profiler is capable of printing a line-by-line report of memory usage and works on Unix and Windows (it needs psutil on the latter). The output is not very detailed, but the goal is to give you an overview of where the code is consuming the most memory, not an exhaustive analysis of allocated objects.

After decorating your function with @profile and running your code with the -m memory_profiler flag, it will print a line-by-line report like this:

Line #    Mem usage  Increment   Line Contents
==============================================
     3                           @profile
     4      5.97 MB    0.00 MB   def my_func():
     5     13.61 MB    7.64 MB       a = [1] * (10 ** 6)
     6    166.20 MB  152.59 MB       b = [2] * (2 * 10 ** 7)
     7     13.61 MB -152.59 MB       del b
     8     13.61 MB    0.00 MB       return a
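As a runnable sketch of the function being profiled above (the no-op fallback decorator is my addition so the script still runs when memory_profiler is not installed):

```python
# Save as demo.py and run:  python -m memory_profiler demo.py
try:
    from memory_profiler import profile
except ImportError:
    def profile(func):  # no-op stand-in when memory_profiler is absent
        return func

@profile
def my_func():
    a = [1] * (10 ** 6)        # ~8 MB of list slots
    b = [2] * (2 * 10 ** 7)    # ~160 MB, freed on the next line
    del b
    return a

if __name__ == "__main__":
    my_func()
```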

18 Comments

For my usecase - a simple image manipulation script, not a complex system, which happened to leave some cursors open - this was the best solution. Very simple to drop in and figure out what's going on, with minimal gunk added to your code. Perfect for quick fixes and probably great for other applications too.
I find memory_profiler to be really simple and easy to use. I want to do profiling per line and not per object. Thanks for writing.
It handles loops only implicitly: when reporting the line-by-line amounts it encounters duplicated lines, and in that case it just takes the max over all iterations.
I have tried memory_profiler but think it is not a good choice. It makes the program execution incredibly slow (in my case, roughly 30 times slower).
This tool is no longer maintained.
315

guppy3 is quite simple to use. At some point in your code, you have to write the following:

from guppy import hpy
h = hpy()
print(h.heap())

This gives you some output like this:

Partition of a set of 132527 objects. Total size = 8301532 bytes.
Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
0  35144  27  2140412  26   2140412  26 str
1  38397  29  1309020  16   3449432  42 tuple
2    530   0   739856   9   4189288  50 dict (no owner)

You can also find out from where objects are referenced and get statistics about that, but somehow the docs on that are a bit sparse.
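A hedged sketch of that referrer lookup, based on guppy3's documented API (the helper function and the choice of the first partition row are mine, and the import is guarded in case guppy3 is not installed):

```python
def largest_partition_by_referrers():
    """Regroup the biggest heap partition by referrer, or None if guppy3 is absent."""
    try:
        from guppy import hpy
    except ImportError:
        return None
    h = hpy()
    heap = h.heap()            # the per-type partition shown above
    return str(heap[0].byrcs)  # objects of the largest row, grouped by referrer

print(largest_partition_by_referrers())
```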

There is a graphical browser as well, written in Tk.

For Python 2.x, use Heapy.

14 Comments

If you're on Python 2.7 you may need the trunk version of it: sourceforge.net/tracker/…, pip install https://guppy-pe.svn.sourceforge.net/svnroot/guppy-pe/trunk/guppy
The heapy docs are... not good. But I found this blog post very helpful for getting started: smira.ru/wp-content/uploads/2011/08/heapy.html
Note, heapy doesn't include memory allocated in python extensions. If anybody has worked out a mechanism to get heapy to include boost::python objects, it would be nice to see some examples!
As of 2014-07-06, guppy does not support Python 3.
There is a fork of guppy that supports Python 3 called guppy3.
84

I recommend Dowser. It is very easy to set up, and you need zero changes to your code. You can view counts of objects of each type over time, view the list of live objects, and view references to live objects, all from a simple web interface.

# memdebug.py

import cherrypy
import dowser

def start(port):
    cherrypy.tree.mount(dowser.Root())
    cherrypy.config.update({
        'environment': 'embedded',
        'server.socket_port': port
    })
    cherrypy.server.quickstart()
    cherrypy.engine.start(blocking=False)

You import memdebug, and call memdebug.start. That's all.

I haven't tried PySizer or Heapy. I would appreciate others' reviews.

UPDATE

The above code is for CherryPy 2.x. In CherryPy 3.x, the server.quickstart method has been removed and engine.start does not take the blocking flag. So if you are using CherryPy 3.x:

# memdebug.py

import cherrypy
import dowser

def start(port):
    cherrypy.tree.mount(dowser.Root())
    cherrypy.config.update({
        'environment': 'embedded',
        'server.socket_port': port
    })
    cherrypy.engine.start()

10 Comments

But is it only for CherryPy? How do you use it with a simple script?
It is not for CherryPy. Think of CherryPy as a GUI toolkit.
There is a generic WSGI port of Dowser called Dozer, which you can use with other web servers as well: pypi.python.org/pypi/Dozer
cherrypy 3.1 removed cherrypy.server.quickstart(), so just use cherrypy.engine.start()
This doesn't work in Python 3. I get an obvious StringIO error.
70

Consider the objgraph library (see this blog post for an example use case).
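A hedged sketch of typical objgraph calls (the wrapper function is my addition; show_growth and most_common_types are documented objgraph API, and the import is guarded in case the package is absent):

```python
def snapshot_types(limit=5):
    """Return the most common live object types, or None if objgraph is absent."""
    try:
        import objgraph
    except ImportError:
        return None
    objgraph.show_growth()    # prints types whose counts grew since the last call
    return objgraph.most_common_types(limit=limit)

print(snapshot_types())
```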

2 Comments

objgraph helped me solve a memory leak issue I was facing today. objgraph.show_growth() was particularly useful
I, too, found objgraph really useful. You can do things like objgraph.by_type('dict') to understand where all of those unexpected dict objects are coming from.
19

Muppy is (yet another) memory usage profiler for Python. The focus of this toolset is on the identification of memory leaks.

Muppy tries to help developers identify memory leaks in Python applications. It enables the tracking of memory usage at runtime and the identification of objects that are leaking. Additionally, it provides tools to locate the source of objects that are not released.
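A hedged sketch using muppy, which ships as part of the Pympler package (the helper function is mine, and the import is guarded in case Pympler is not installed):

```python
def live_object_summary():
    """Summarize all live objects with muppy, or None if Pympler is not installed."""
    try:
        from pympler import muppy, summary
    except ImportError:
        return None
    all_objects = muppy.get_objects()      # every object the GC can see
    rows = summary.summarize(all_objects)  # (type, count, total size) rows
    summary.print_(rows)                   # human-readable table
    return rows

live_object_summary()
```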


16

I'm developing a memory profiler for Python called memprof:

http://jmdana.github.io/memprof/

It allows you to log and plot the memory usage of your variables during the execution of the decorated methods. You just have to import the library using:

from memprof import memprof

And decorate your method using:

@memprof

This is an example of what the plots look like:

(plot: memory usage of each variable over the decorated function's execution)

The project is hosted in GitHub:

https://github.com/jmdana/memprof
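Putting the two snippets above together in one script (the no-op fallback decorator is my addition so the sketch runs even where memprof is not installed; the variables a, b, c match the names shown in the plot):

```python
try:
    from memprof import memprof
except ImportError:
    def memprof(func):  # no-op stand-in when memprof is absent
        return func

@memprof
def my_func():
    a = [1] * (10 ** 6)       # each variable's size is logged per sample
    b = list(range(10 ** 5))
    c = "x" * (10 ** 6)
    return len(a) + len(b) + len(c)

print(my_func())
```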

2 Comments

How do I use it? What is a,b,c?
@tommy.carstensen a, b and c are the names of the variables. You can find the documentation at github.com/jmdana/memprof. If you have any questions please feel free to submit an issue in github or send an email to the mailing list that can be found in the documentation.
12

I found meliae to be much more functional than Heapy or PySizer. If you happen to be running a WSGI web app, then Dozer is a nice middleware wrapper of Dowser.


7

Try also the pytracemalloc project, which provides the memory usage per Python line number.

EDIT (2014/04): It now has a Qt GUI to analyze snapshots.
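tracemalloc has been part of the standard library since Python 3.4, so a per-line report needs no third-party code. A minimal sketch:

```python
import tracemalloc

tracemalloc.start()

data = [bytes(1000) for _ in range(1000)]  # allocate roughly 1 MB

snapshot = tracemalloc.take_snapshot()
for stat in snapshot.statistics("lineno")[:3]:  # top allocation sites by line
    print(stat)
```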

1 Comment

tracemalloc is now part of the python standard library. See docs.python.org/3/library/tracemalloc.html
