
I am looking for the fastest way to remove duplicate values in a string separated by commas.

My string looks like this:

$str = 'one,two,one,five,seven,bag,tea';

I can do it by exploding the string into values and then comparing them, but I think that will be slow. What about preg_replace()? Will it be faster? Has anyone done it using this function?

  • what is the estimated size of this data? Commented Apr 10, 2010 at 10:41

2 Answers

151

The shortest code would be:

$str = implode(',',array_unique(explode(',', $str)));

Whether it is the fastest, I don't know, but it is probably faster than looping explicitly.

Reference: implode, array_unique, explode
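
As a quick check of the one-liner, here is a minimal sketch that runs it on the string from the question. The wrapper name dedupe_csv() and its $delimiter parameter are hypothetical, added only for illustration.

```php
<?php
// Hypothetical helper: wraps the explode -> array_unique -> implode chain.
function dedupe_csv(string $csv, string $delimiter = ','): string
{
    $parts = explode($delimiter, $csv);
    // array_unique() keeps the first occurrence of each value, in original order.
    return implode($delimiter, array_unique($parts));
}

$str = 'one,two,one,five,seven,bag,tea';
$result = dedupe_csv($str);
echo $result, PHP_EOL; // one,two,five,seven,bag,tea
```

Note that values are compared as-is, so 'one' and ' one' (with a leading space) would count as different values.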


10 Comments

Thank you @Felix, that is excellent, that is what I needed, the max values in a string are 50.
@Adnan: With 50 values this should not be much of a problem :)
Works if multiple of 2. If not, fails.
@DanielOmine: not sure what exactly you mean. It doesn't matter how many elements there are in the array.
@DanielOmine: "but also removes duplicated folder names" well, but that's exactly what this question is asking about. So yeah, this is not a solution for your use case. If you want to collapse consecutive /, you can just do preg_replace('#//+#', '/', $str).
1

Dealing with: $string = 'one,two,one,five,seven,bag,tea';

If you are generating the string at any point "up script", then you should be eliminating duplicates as they occur.

Let's say you are using concatenation to generate your string like:

$string='';
foreach($data as $value){
    $string.=(strlen($string)?',':'').some_func($value);
}

...then you would need to extract unique values from $string based on the delimiter (comma), then re-implode with the delimiter.


I suggest that you design a more direct method and deny duplicates inside of the initial foreach loop, like this:

$array=[];  // initialize so implode() always receives an array, even when $data is empty
foreach($data as $value){
    $return_value=some_func($value);  // cache the returned value so you don't call the function twice
    $array[$return_value]=$return_value;  // store the return value in a temporary array using the function's return value as both the key and value in the array.
}
$string=implode(',',$array);  // clean: no duplicates, no trailing commas

This works because duplicate values are never permitted to exist. Each subsequent occurrence overwrites the earlier one. This function-less filter works because an array cannot hold two identical keys at the same level.
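
To make the keyed approach concrete, here is a small self-contained run. The sample $data and the use of trim() as a stand-in for some_func() are assumptions for illustration only.

```php
<?php
// Sample input (assumed): some values differ only by surrounding whitespace.
$data = [' one', 'two ', 'one', 'five'];

$array = [];
foreach ($data as $value) {
    $return_value = trim($value);          // trim() stands in for some_func()
    $array[$return_value] = $return_value; // an existing key is overwritten, never duplicated
}

$string = implode(',', $array);
echo $string, PHP_EOL; // one,two,five
```

The order of the output follows the first time each key was inserted, because overwriting an existing key does not move it.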

Alternatively, you can avoid "overwriting" array data in the loop by writing if(!isset($array[$return_value])){$array[$return_value]=$return_value;} but that means running an isset() check on every iteration. The advantage of using these associative key assignments is that the process avoids in_array(), which is slower than isset().

All that said, if you are extracting a column of data from a 2-dimensional array like:

$string='';
foreach($data as $value){
    $string.=(strlen($string)?',':'').$value['word'];
}

Then you could leverage the magic of array_column() without a loop like this:

echo implode(',',array_column($data,'word','word'));
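
Here is a minimal sketch of that array_column() call on sample rows. The 'id' column and the row values are made up for illustration; only the 'word' column comes from the example above.

```php
<?php
// Assumed sample data: a 2-dimensional array with a repeated 'word' value.
$data = [
    ['id' => 1, 'word' => 'one'],
    ['id' => 2, 'word' => 'two'],
    ['id' => 3, 'word' => 'one'],
    ['id' => 4, 'word' => 'five'],
];

// The third argument indexes the result by 'word', so repeated words
// collapse onto the same key.
$unique = array_column($data, 'word', 'word');

$string = implode(',', $unique);
echo $string, PHP_EOL; // one,two,five
```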

And finally, for those interested in micro-optimization, I'll note that the single call of array_unique() is actually slower than a few two-function methods. Read here for more details.
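
The answer does not say which two-function methods it means. Two candidates often used for string or integer values are sketched below; treat them as an assumption rather than the methods the linked post benchmarked, and measure on your own data before choosing one.

```php
<?php
$words = explode(',', 'one,two,one,five,seven,bag,tea');

// Candidate 1: flip values to keys (duplicates collapse), then read the keys back.
$viaKeys = array_keys(array_flip($words));

// Candidate 2: flip twice; values are unique again, keys are the last index of each value.
$viaDoubleFlip = array_flip(array_flip($words));

echo implode(',', $viaKeys), PHP_EOL;       // one,two,five,seven,bag,tea
echo implode(',', $viaDoubleFlip), PHP_EOL; // one,two,five,seven,bag,tea
```

array_flip() only accepts string and integer values, so neither variant is a general replacement for array_unique().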

The bottom line is that there are many ways to perform this task. explode->unique->implode may be the most concise method in some cases if you aren't generating the delimited string yourself, but it is not likely to be the most direct or fastest method. Choose for yourself what is best for your task.
