5

I use all over my code std::string objects, most of them stack allocated. I replaced default new/delete operators to check the average memory allocation size and I was surprised to see that most allocations come from std::string and, this is very important, they are between 12 and 32 bytes! To optimize this(large number of small strings allocated from the free storage) I came with a simple solution:

  • implement custom string class
  • add predefined buffer in it with fixed size, for example 64 bytes(I can adjust this size after profiling)
  • use predefined buffer when the length of the string is less than 64 bytes or allocate new buffer from the free storage for larger strings.

With this implementation I will add 64 bytes memory overhead to all my strings with length > 64 add also (64-lenth) overhead to strings shorter than 64 bytes, but I will prevent all the allocations for small strings.

Did you tried such approach? Is it resonable to add this memory overhead, most of the time on the stack to save allocation time/memory fragmentation?

Is this approach better than custom str::string allocators?

8
  • 2
    The rules boil down to: "1. Don't optimize early. 2. Don't optimize until you know that it's needed. 3. Even then, don't optimize until you know what’s needed, and where." -Herb Sutter Commented Dec 21, 2012 at 9:33
  • 3rd rule implies actual profiling with decent tools. Commented Dec 21, 2012 at 9:33
  • 4
    This is commonly known as Short String Optimization. This is implemented in the Dirkumware STL (used by VC++) while libstdc++ and libc++ (respectively coming with gcc and clang) have preferred the Copy On Write behavior: eagerly allocate on heap, but share underlying buffer as much as possible. Commented Dec 21, 2012 at 9:33
  • 1
    @MatthieuM. I don't think libc++ ever implemented CoW. It was developed from scratch as a C++11 library and CoW is not permitted by the C++11 string requirements. libc++ implements the small string optimization for strings up to 22 chars in length on 64-bit platforms. Commented Dec 21, 2012 at 9:51
  • 2
    @9dan: I think others have stated that already sooner than Herb Sutter, e.g. Michael Jackson: "The First Rule of Program Optimization: Don't do it. The Second Rule of Program Optimization (for experts only!): Don't do it yet.") Commented Dec 21, 2012 at 11:56

3 Answers 3

7

Is it reasonable? It depends. Unless you have found yourself actually running out of memory by using std::string or identified a real (as opposed to perceived) performance problem with continued allocation and deallocation, you're wasting your time doing this.

Don't get me wrong, you can (and I have) made great gains over standard allocators by using extra information unavailable to them. But, unless there's a real problem to be fixed, you're better off using the standard method and devoting your time elsewhere.

Sign up to request clarification or add additional context in comments.

3 Comments

Note: as bames53 noted, it seems that libc++ use SSO instead of COW, therefore it might be profitable to test it out.
@MatthieuM., yes, but only if it's an actual problem (which is the gist of my answer). Doubling the speed of an operation that takes a millionth of a second will not buy you much unless you do it many, many millions of times. There may well be more suitable (and profitable) areas for optimisation.
I completely agree, which is why I recommend swithing the implementation of the Standard library and measuring (because it could be as simple as recompiling) rather than fully reengineering the code and measuring. In case it proves not to be an issue, it's better not to have spent too much time on it :)
5

I think it's been done before, along the lines you mentioned, in folly: "fbstring is a drop-in replacement for std::string. The main benefit of fbstring is significantly increased performance on virtually all important primitives."

https://github.com/facebook/folly/blob/master/folly/docs/FBString.md

This should be pretty well optimized:

Storage strategies

Small strings (<= 23 chars) are stored in-situ without memory allocation.

Medium strings (24 - 255 chars) are stored in malloc-allocated memory and copied eagerly.

Large strings (> 255 chars) are stored in malloc-allocated memory and copied lazily.

Comments

2

Did you tried such approach?

No, but I did write my own allocator. It is an alternative I prefer.

Is it resonable to add this memory overhead, most of the time on the stack to save allocation time/memory fragmentation?

The definition of what is "reasonable" varies with your situation. Do you need to prevent memory fragmentation? Do you need to minimize number of allocations? Maximize speed at the cost of extra memory per string?

Do you simply want to optimize the std:: implementation as an exercise?

Do you have a target platform with performance (or memory) limitations?

Each of these questions will alter the answer (on whether the memory overhead is reasonable).

Is this approach better than custom str::string allocators?

Depends on your situation.

1 Comment

The String will be part of a game engine and there are many Strings declared as local variables in some functions that are called per frame. It is used on mobile devices(iOS/Android/Windows Phone/...)

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.