8

I have a function that constructs a std::string from a const char* with two numbers, passed as parameters, appended to the end of it.

std::string makeName(const char* name, uint16_t num1, uint16_t num2) {

    std::string new_name(name);
    new_name.reserve(new_name.length()+5);

    new_name += ":";
    new_name += boost::lexical_cast<std::string>(num1);
    new_name += ":";
    new_name += boost::lexical_cast<std::string>(num2);

    return new_name;
}

This function gets called thousands of times to create unique names for small objects allocated on the heap.

Object* object1= new Object(makeName("Object", i, j)); // i and j are simply loop indices

I have discovered using valgrind's massif tool that the calls to makeName allocates a lot of memory since it gets called so many times.

87.96% (1,628,746,377B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc.

->29.61% (548,226,178B) 0xAE383B7: std::string::_Rep::_S_create(unsigned long, unsigned long, std::allocator<char> const&) (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.19)

| ->26.39% (488,635,166B) 0xAE38F79: std::string::_Rep::_M_clone(std::allocator<char> const&, unsigned long) (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.19)

| | ->26.39% (488,633,246B) 0xAE39012: std::string::reserve(unsigned long) (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.19)

| | | ->15.51% (287,292,096B) 0x119A80FD: makeName(char const*, unsigned short, unsigned short) (Object.cpp:110)

| | | | ->15.51% (287,292,096B) in 42 places, all below massif's threshold (01.00%)

My question is, how can I minimize these allocations to help reduce the overall total amount of memory my program uses?

EDIT: I also want to note that as a program requirement I cannot use c++11 features.

10
  • 2
    Try using a stringstream. sstream ss; ss << name << " : " << num1 << " : " << num2; return ss.str(); Commented Feb 5, 2015 at 15:17
  • Does the "name" have to be a std::string ? If your number of objects is within numeric limits, how about assigning a unique int to it? Commented Feb 5, 2015 at 15:18
  • @NathanOliver A good first step: but if the code in question is a serious bottleneck, going to stringstream isn't the answer. Commented Feb 5, 2015 at 15:21
  • @akashPradhan - that is a good suggestion, however the object's name is typically going to be something other than "Object" - it is going to be a descriptive name of the object Commented Feb 5, 2015 at 15:22
  • 1
    Oh, and how many objects are we talking about? The name "Object:22:979" will take up 14 bytes of memory, plus 12-24 bytes for the pointers to track it, and another 4-16 bytes of allocation overhead by the heap. If that is large compared to your object, and you have many objects... that is overhead. And if the objects are otherwise small, the high percentage might be because that is what you asked for? Commented Feb 5, 2015 at 15:24

3 Answers 3

3

Only a DIY custom conversion beats using sprintf in such case.

So I would use sprintf and MEASURE.

Only if that wasn't good enough would I implement my own integer-to-string (knowing from numerous cases that it will certainly be somewhat faster, but not enough to justify starting with that).


Example. Instead of the current high level code

std::string makeName(const char* name, uint16_t num1, uint16_t num2) {

    std::string new_name(name);
    new_name.reserve(new_name.length()+5);

    new_name += ":";
    new_name += boost::lexical_cast<std::string>(num1);
    new_name += ":";
    new_name += boost::lexical_cast<std::string>(num2);

    return new_name;
}

just do

auto makeName( const char* const name, const uint16_t num1, const uint16_t num2 )
    -> std::string
{
    std::string result( strlen( name ) + 25, '\0' );    // 25 is large enough.
    int const n = sprintf( &result[0], "%s:%d:%d", name, num1, num2 );
    result.resize( n );
    return result;
}

Disclaimer: code not touched by compiler's hands.

Sign up to request clarification or add additional context in comments.

8 Comments

Depending on the compiler, Boost.Spirit.Karma handily beats sprintf. There’s also the FastFormat library, but I’ve never used that.
@KonradRudolph: Thanks for reference to FastFormat. So I learned a new thing also today. Not yet dead! :)
Is std::string guaranteed contiguous now? I thought &v[0] was only safe with vector
@KennyOstrom: std::string's buffer has been guaranteed contiguous since the Lillehammer meeting in 2005, I think it was. That's about a decade now. ;-)
@Mgetz: There are a number of issues with your statement and it really needs a longer discussion (e.g. there were changes from C++03 to C++11). But here it suffices to note that the code above does not rely on null-termination of the ordinary buffer contents. The initialization to zero is just because it goes against good programming practice to introduce needless inefficiency, even micro-inefficiency, but except for that the buffer could just as well have been initialized to e.g. # characters.
|
2

"My question is, how can I minimize these allocations"

It occurs to me that you have a reason for those names. Can you compute them at the time of need, rather than always generating the name in the constructor? That would be the best improvement - don't do it at all until needed.

If you happen to already have a virtual base, and the class type determines the string, that makes it really easy. Otherwise, an enumerated type could replace the string, and you have a lookup table.

Object* object1= new Object(i, j));
std::string Object::getName(){ compute here }

If this doesn't help, then you actually do need the string for each and every object, and you can only get a small speedup by optimizing that function. I noticed that you construct the string at one size then grow it afterwards. Also you could work with a char buffer then assign it to the member string (in the constructor).

1 Comment

Yes, unfortunately I cannot compute the names at the time of need, I have to store them in the object so that I can reference the object by its unique name.
1

Yes, your code is doing a lot of allocations. Analyzing the allocations:

std::string new_name(name); // 1
new_name.reserve(new_name.length()+5); // 2

new_name += ":";
new_name += boost::lexical_cast<std::string>(num1); // possibly 4 (boost + operator+=)
new_name += ":"; // possibly 5
new_name += boost::lexical_cast<std::string>(num2); // possibly 7

'possibly' because it depends on the characters needed by the numbers (the higher, the more).

If you're really concerned about memory allocations, asprintf(not standard though) or your version (based on the return value of s(n)printf) is likely to be the best choice:

std::string makeName(const char* name, uint16_t num1, uint16_t num2)
{
     char *ptr = nullptr; 
     int size  = asprintf( &ptr, "%s:%u:%u", name, num1, num2);
  return std::string(ptr, size); // to avoid copying the chars
}  

Note: As @Cheersandhth.-Alf pointed out, in case std::string failed to allocate memory, ptr would be ptr is leaked. The best way to solve this would involve using std::unique_ptr but I'll leave you to work it out to your needs.

if you don't want to use asprintf, you can get a similar behavior with std::snprintf

std::string makeName(const char* name, uint16_t num1, uint16_t num2)
{
    int length = std::snprintf(nullptr, 0, "%s:%u:%u", name, num1, num2);

    if (length > 0 )
    {
        std::string new_name(length + 1, '\0');
        std::snprintf(&new_name[0], new_name.length(), "%s:%u:%u", name, num1, num2);

       return new_name;
    } 
    // else handle failure
} 

The difference with your version (I didn't use boost::lexical_cast but std::to_string) is very big: ran 500 times the first version uses 72,890 bytes while the second only 23,890! (measured with valgrind memcheck)

5 Comments

–1 The code shown does not work and would not work if "corrected". The basic idea of this code, storing a pointer to a new buffer in &(s.data()), is not even meaningful. Add to that a non-standard function.
@Cheersandhth.-Alf You're right, I was realizing that too. Fixing.
By the way, the probably easiest way to deal with the memory leak is to use a std::unique_ptr with a custom destruction function. The custom destruction function should use free to deallocate the memory.
Well, since the uints can have up to 6 characters you could only calculate the string length + 12. But, AFAIK, that's the best way to allocate the right space for s(n)printf. I'll put a note about the memory leak to let the OP reason about it so he can fulfill his requirements (he can't use C++11).
Note that there's no way to specify a buffer to std::string, except moving a buffer from a temporary. The constructor taking a pointer does not take ownership of the argument buffer. It copies.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.