35

How would I do something in C++ similar to the following code:

//Lang: Java
string.replaceAll("  ", " ");

The intent of this snippet is to replace every run of multiple spaces in a string with a single space.

5 Answers

85
bool BothAreSpaces(char lhs, char rhs) { return (lhs == rhs) && (lhs == ' '); }

std::string::iterator new_end = std::unique(str.begin(), str.end(), BothAreSpaces);
str.erase(new_end, str.end());   

How this works: std::unique has two forms. The first form goes through a range and removes adjacent duplicates, so the string "abbaaabbbb" becomes "abab". The second form, which I used, takes a predicate that takes two elements and returns true if they should be considered duplicates. The function I wrote, BothAreSpaces, serves this purpose. It determines exactly what its name implies: that both of its parameters are spaces. So when combined with std::unique, duplicate adjacent spaces are removed.
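To make the difference between the two forms concrete, here is a minimal sketch. The helper names collapse_all_runs and collapse_space_runs are made up for illustration and are not part of the original answer; the predicate is written as a lambda, which assumes C++11 or later.

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper: collapses every run of equal adjacent characters.
std::string collapse_all_runs(std::string s) {
    // First form of std::unique: neighbours are compared with operator==.
    s.erase(std::unique(s.begin(), s.end()), s.end());
    return s;
}

// Hypothetical helper: collapses only runs of spaces.
std::string collapse_space_runs(std::string s) {
    // Second form: the predicate decides which neighbours count as duplicates.
    s.erase(std::unique(s.begin(), s.end(),
                        [](char lhs, char rhs) { return lhs == ' ' && rhs == ' '; }),
            s.end());
    return s;
}
```

With these helpers, collapse_all_runs("abbaaabbbb") gives "abab", while collapse_space_runs("aa  bb   c") leaves the doubled letters alone and gives "aa bb c".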

Just like std::remove and remove_if, std::unique doesn't actually make the container smaller, it just moves elements at the end closer to the beginning. It returns an iterator to the new end of range so you can use that to call the erase function, which is a member function of the string class.

Breaking it down, the erase function takes two parameters, a begin and an end iterator for a range to erase. For its first parameter I'm passing the return value of std::unique, because that's where I want to start erasing. For its second parameter, I am passing the string's end iterator.
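The two steps described above can be seen separately in the following sketch. The function names size_after_unique_only and squeeze_spaces are assumptions added for illustration; only BothAreSpaces comes from the answer itself.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

bool BothAreSpaces(char lhs, char rhs) { return (lhs == rhs) && (lhs == ' '); }

// Hypothetical helper: runs std::unique alone and reports the string's size.
// The size is unchanged, because std::unique never shrinks the container.
std::size_t size_after_unique_only(std::string str) {
    std::unique(str.begin(), str.end(), BothAreSpaces);
    return str.size();
}

// Hypothetical helper: the full idiom, std::unique followed by erase.
std::string squeeze_spaces(std::string str) {
    std::string::iterator new_end = std::unique(str.begin(), str.end(), BothAreSpaces);
    // Everything from new_end onward is leftover; erase actually shortens the string.
    str.erase(new_end, str.end());
    return str;
}
```

For the five-character input "a   b", size_after_unique_only still reports 5, whereas squeeze_spaces returns the three-character string "a b".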


13 Comments

Cool, never seen this before. +1
@Seth: Neither have I, it just came to me suddenly.
You could even do template<char Remove> bool BothAre(char lhs, char rhs) { return lhs == rhs && lhs == Remove; } then str.erase(std::unique(str.begin(), str.end(), BothAre<' '>), str.end()); to make it a tiny bit generic and usable for other characters too
@user386911 std::unique moves all consecutive duplicate characters in between the two iterators it receives to the end iterator, so that all the characters end up at the end of the string. It then returns the iterator to the beginning of all the characters it moved to the end of the string, and you pass that iterator to str.erase which takes two iterators and removes all the characters between them. tl;dr: all the duplicate spaces end up at the end of the string via unique, then erase removes them.
@Seth: "all the characters end up at the end of the string" <-- Common myth. Neither std::unique nor std::remove is required to do this, and I'm not aware of any implementation where they do. They just copy or move the non-duplicate elements from the end toward the front.
6

So, I tried an approach using std::remove_if and a lambda expression. To my eyes it is still easier to follow than the code above, though it lacks that "wow, neat, I didn't realize you could do that" quality. I'm posting it anyway, if only for learning purposes:

bool prev(false);
char rem(' ');
auto iter = std::remove_if(str.begin(), str.end(), [&] (char c) -> bool {
    if (c == rem && prev) {
        return true;
    }
    prev = (c == rem);
    return false;
});
str.erase(iter, str.end());

EDIT: I realized that std::remove_if returns an iterator that can be passed straight to erase, so I removed the unnecessary code.

1 Comment

str.erase(iter, str.end()); instead of in.erase(iter, in.end());
4

A variant of Benjamin Lindley's answer that uses a lambda expression to make things cleaner:

std::string::iterator new_end = 
        std::unique(str.begin(), str.end(),
        [=](char lhs, char rhs){ return (lhs == rhs) && (lhs == ' '); }
        );
str.erase(new_end, str.end());

1 Comment

Capturing with [=] is unnecessary here, since the lambda uses no outside variables; just use [].
2

Why not use a regular expression?

str = boost::regex_replace(str, boost::regex(" {2,}"), " ");

4 Comments

C++11 also includes the regex library, which can be used if using boost is an issue.
There's no good reason, other than it's just more fun to do algorithm jujitsu.
Boost isn't part of the standard library, is it?
@OwolabiEzekielTobiloba no it is not.
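Following up on the comment about the C++11 regex library, here is a minimal sketch of the same idea without Boost. The function name squeeze_spaces_regex is made up for illustration. Note that std::regex_replace returns a new string rather than modifying its argument.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: replaces every run of two or more spaces with one space.
std::string squeeze_spaces_regex(const std::string& str) {
    // Built once and reused, since constructing a std::regex is comparatively costly.
    static const std::regex multiple_spaces(" {2,}");
    return std::regex_replace(str, multiple_spaces, " ");
}
```

A call such as squeeze_spaces_regex("a   b  c d") returns "a b c d".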
1

How about isspace(lhs) && isspace(rhs) to handle all types of whitespace?
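A minimal sketch of that idea, under a few assumptions: the names BothAreWhitespace and squeeze_whitespace are invented for illustration, and the cast to unsigned char is added because passing a negative char value to std::isspace is undefined behavior. Since std::unique keeps the first element of each run, a mixed run such as a tab followed by spaces collapses to its first character, not necessarily to a space.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical predicate: true when both neighbours are whitespace of any kind.
bool BothAreWhitespace(char lhs, char rhs) {
    // The cast avoids undefined behavior for negative char values.
    return std::isspace(static_cast<unsigned char>(lhs)) &&
           std::isspace(static_cast<unsigned char>(rhs));
}

// Hypothetical helper: collapses each run of whitespace to its first character.
std::string squeeze_whitespace(std::string str) {
    str.erase(std::unique(str.begin(), str.end(), BothAreWhitespace), str.end());
    return str;
}
```

For example, squeeze_whitespace("a \t\n b") returns "a b", while squeeze_whitespace("a\t \tb") returns "a\tb" because the run starts with a tab.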

1 Comment

Please post an answer only after trying it yourself, and include working sample code.
