I'd probably look into something like tcmalloc, jemalloc, or some other malloc replacement. tcmalloc provides a fair bit of introspection - http://gperftools.googlecode.com/svn/trunk/doc/tcmalloc.html gives an overview of what it makes available. Take a look in the "Generic Tcmalloc Status" and "Memory Introspection" sections for some ideas that might be helpful if you choose to go that route. If you want to read about jemalloc, see http://www.facebook.com/notes/facebook-engineering/scalable-memory-allocation-using-jemalloc/480222803919 .
Besides that, there are also some OS-dependent mechanisms for getting the info. On Linux, /proc/self/statm should have everything you want (the values are reported in pages). man proc has the docs on the files there.
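For the /proc route, here is a minimal Linux-only sketch in C++ (assuming C++17 for std::optional). The names StatmInfo and read_statm are made up for this example; only the first two fields of /proc/self/statm (total program size and resident set size, both in pages, per man proc) are read.

```cpp
#include <unistd.h>   // sysconf, _SC_PAGESIZE

#include <fstream>
#include <optional>

// Hypothetical helper type for this sketch, not part of any library.
struct StatmInfo {
    long total_program_bytes;  // first field of statm ("size"), converted to bytes
    long resident_bytes;       // second field of statm ("resident"), converted to bytes
};

// Reads the first two fields of /proc/self/statm and converts pages to bytes.
// Returns std::nullopt if the file is missing or malformed (e.g. not on Linux).
std::optional<StatmInfo> read_statm() {
    std::ifstream in("/proc/self/statm");
    long size_pages = 0;
    long resident_pages = 0;
    if (!(in >> size_pages >> resident_pages)) {
        return std::nullopt;
    }
    const long page_size = sysconf(_SC_PAGESIZE);
    if (page_size <= 0) {
        return std::nullopt;
    }
    return StatmInfo{size_pages * page_size, resident_pages * page_size};
}
```

Calling read_statm() before and after a large allocation and comparing resident_bytes gives a rough picture of what the process actually holds, as opposed to what your objects asked for.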
The malloc replacements that are instrumented for stats are probably easier to use, more portable, and more comprehensive than anything you might implement yourself.
If you track allocations yourself by overriding new and delete, you also have to be aware of things like fragmentation, which can lead to a big difference between the number of bytes currently allocated to objects and the number of bytes/pages actually held by the process. The latter is probably the bigger question.
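To make that gap concrete, here is a rough sketch of what a hand-rolled counter looks like, assuming you replace the global operator new and operator delete. The name g_live_bytes and the size-header trick are my own choices for illustration, not something from tcmalloc or jemalloc. It counts only the bytes that were requested, so allocator overhead, fragmentation, and over-aligned allocations are all invisible to it.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical counter: bytes currently requested through operator new.
std::atomic<std::size_t> g_live_bytes{0};

namespace {
// Reserve a header in front of each block to remember its size.
// Using the maximum fundamental alignment keeps the user pointer aligned.
constexpr std::size_t kHeader = alignof(std::max_align_t);
static_assert(kHeader >= sizeof(std::size_t), "header must hold a size_t");
}  // namespace

void* operator new(std::size_t size) {
    // Sketch only: no overflow check on size + kHeader.
    void* raw = std::malloc(size + kHeader);
    if (raw == nullptr) {
        throw std::bad_alloc();
    }
    *static_cast<std::size_t*>(raw) = size;  // remember the requested size
    g_live_bytes += size;
    return static_cast<char*>(raw) + kHeader;
}

void operator delete(void* ptr) noexcept {
    if (ptr == nullptr) {
        return;
    }
    char* raw = static_cast<char*>(ptr) - kHeader;
    g_live_bytes -= *reinterpret_cast<std::size_t*>(raw);
    std::free(raw);
}

// Sized delete forwards to the unsized version so both paths stay consistent.
void operator delete(void* ptr, std::size_t) noexcept {
    operator delete(ptr);
}
```

The default new[] and delete[] forward to these, so arrays are counted too. Comparing g_live_bytes against the resident size the OS reports shows how far apart "bytes in live objects" and "memory held by the process" can drift.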