Return to Revisions

1 of 2

answered Jul 14, 2018 at 21:39

Interesting problem!

I am sure that someone can and will come up with a better solution, but here are my initial thoughts: Your biggest bottleneck will be the continual reallocation of memory. Whenever you do a replace, you are creating a new string, copying the content from src to destination and do a modification on the dest (result) string... well more or less, the details are besides the point. The problem is that memory allocation and deallocation will be >90% bottleneck. Its is very expensive so avoid it, if you can, or minimize it.

The strategy I am going to propose is based on minimizing the result string allocation to just once.

1.) Build a map of indices indicating start/end pos as well as the replacing string of a block that needs to be changed. e.g. "this is just BEL test and another CHN test" -> [[13,16,Belgium], [34,37,China]]

2.) The resulting size is sum of all gap sizes + sum of replacement size, in other words:

13 chars - from index 1, the beginning, to 13
 7 chars - from index 13 to 16 ("BEL" -> "Belgium")
19 chars - from index 16 to 34
 5 chars - from index 37 to 40 ("CHN" -> "China")
 3 chars - from index 40 to 42, the end

Your result string will be 13+7+19+5+3 = 47 chars long, you allocate this once.

3.) Reiterate and construct your result string accordingly

I think this solution should suffice for now, until someone pulls out the standard way to approach this problem!

answered Jul 14, 2018 at 21:39

Aphton