This question asks for help with a task composed of three separate procedures.
How to split a string on spaces to generate an array of words? (The OP has a suboptimal, yet working solution for this part.)
- Because the pattern only needs to match the spaces between words, it could be simplified to
/ /
. This eliminates the check for additional white-space characters beyond just the space.
- Better and faster than a regex-based solution would be to split the string using string functions.
explode(' ',$descr)
would be the most popular and intuitive function call.
str_word_count($descr,1)
as Ravi Hirani pointed out will also work, but is less intuitive.
A major benefit to this function is that it seamlessly omits punctuation --
for instance, if the OP's sample string had a period at the end, this function would omit it from the array!
Furthermore, it is important to note what is considered a "word":
For the purpose of this function, 'word' is defined as a locale dependent string containing alphabetic characters, which also may contain, but not start with "'" and "-" characters.
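To make the difference concrete, here is a minimal sketch that runs the three splitting approaches on a variant of the sample string. The trailing period is an assumption added purely for demonstration; it is not part of the OP's input.

```php
<?php
// Sample string with a trailing period added for demonstration purposes.
$descr = "Hello this is a test string.";

// Regex split on a literal space.
$viaRegex = preg_split('/ /', $descr);

// Plain string split on a literal space.
$viaExplode = explode(' ', $descr);

// Format 1 returns an array of the words found; punctuation is not part of a word.
$viaWordCount = str_word_count($descr, 1);

var_export(end($viaExplode));   // 'string.' (the period is kept)
echo "\n";
var_export(end($viaWordCount)); // 'string' (the period is dropped)
echo "\n";
```

Whether dropping punctuation is desirable depends on the real input, so it is worth testing against a few representative strings first.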
How to generate an indexed array with keys starting from 1?
- Bind a generated "keys" array (from 1) to a "values" array:
$words=explode(' ',$descr); array_combine(range(1,count($words)),$words)
- Add a temporary value to the front of the indexed array (
[0]
), then remove the element with a function that preserves the array keys.
array_unshift($descr,''); unset($descr[0]);
array_unshift($descr,''); $descr=array_slice($descr,1,NULL,true);
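The re-keying options above can be compared side by side. This is only a sketch: it works on a separate `$words` array (and the copies `$viaUnset` and `$viaSlice`, names chosen here for illustration) rather than overwriting `$descr`.

```php
<?php
$descr = "Hello this is a test string";
$words = explode(' ', $descr);

// Option A: pair a generated list of keys (1..n) with the values.
// explode() always returns at least one element, so range() never counts down here.
$viaCombine = array_combine(range(1, count($words)), $words);

// Option B: push a placeholder to index 0, then unset it (remaining keys are untouched).
$viaUnset = $words;
array_unshift($viaUnset, '');
unset($viaUnset[0]);

// Option C: same placeholder, removed with a key-preserving array_slice().
$viaSlice = $words;
array_unshift($viaSlice, '');
$viaSlice = array_slice($viaSlice, 1, null, true);

// All three produce identical keys, values, and ordering.
var_export($viaCombine === $viaUnset && $viaUnset === $viaSlice); // true
```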
How to convert a string to all lowercase? (It was hard to find a duplicate -- this is an RTM question.)
lcfirst($descr)
will work in the OP's test case because only the first letter of the first word is capitalized.
strtolower($descr)
is a more reliable choice as it changes whole strings to lowercase.
mb_strtolower($descr)
is the safer choice if the string may contain multibyte (e.g. UTF-8) characters.
- Note:
ucwords()
exists, but lcwords()
does not.
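A short sketch of how the lowercase functions differ. The mixed-case string and the accented sample are assumptions made for illustration, and the last call assumes the mbstring extension is available and the source file is saved as UTF-8.

```php
<?php
// Mixed-case variant of the sample string, assumed for illustration.
$descr = "Hello This Is A Test String";

$first = lcfirst($descr);    // only the first character is lowercased
$all   = strtolower($descr); // every ASCII letter is lowercased

echo $first, "\n"; // hello This Is A Test String
echo $all, "\n";   // hello this is a test string

// mb_strtolower() comes from the mbstring extension, so guard the call.
if (function_exists('mb_strtolower')) {
    echo mb_strtolower("ÉCOLE Primaire", 'UTF-8'), "\n"; // école primaire
}
```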
There are many paths to a correct result for this question. How do you determine which is the "best" one? Top priority should be accuracy, followed by efficiency/directness, and then some consideration for readability. Code brevity is a matter of personal choice and can clash with readability.
With these considerations in mind, I would recommend these two methods:
Method #1: (one-liner, 3-functions, no new variables)
$descr="Hello this is a test string";
var_export(array_slice(explode(' ',' '.strtolower($descr)),1,null,true));
Method #2: (two-liner, 3-functions, one new variable)
$descr="Hello this is a test string";
$array=explode(' ',' '.strtolower($descr));
unset($array[0]);
var_export($array);
Method #2 should perform faster than #1 because unset() is a "lighter" operation than array_slice(): it removes one element in place, whereas array_slice() builds and returns a new array.
Explanation for #1: Convert the full input string to lowercase and prepend a blank space to it. The blank space will cause explode() to generate an extra empty element at the start of the output array (key 0). array_slice() then returns the generated array starting from offset 1, omitting the unwanted first element, and its fourth argument (true) preserves the original keys so the words stay numbered from 1.
Explanation for #2: The same as #1, except it purges the first element from the generated array using unset(). While this is faster, unset() is a language construct that returns no value, so it must be written as its own statement.
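To check the speed claim on your own setup, a rough micro-benchmark along these lines could be used. The helper names (`wordsViaSlice`, `wordsViaUnset`, `timeIt`) and the iteration count are hypothetical, introduced only for this sketch; `hrtime()` assumes PHP 7.3 or later, and the timings will vary by machine and PHP version.

```php
<?php
// Hypothetical helper wrapping Method #1.
function wordsViaSlice(string $descr): array
{
    return array_slice(explode(' ', ' ' . strtolower($descr)), 1, null, true);
}

// Hypothetical helper wrapping Method #2.
function wordsViaUnset(string $descr): array
{
    $array = explode(' ', ' ' . strtolower($descr));
    unset($array[0]);
    return $array;
}

// Time a callable over many iterations and return elapsed milliseconds.
function timeIt(callable $fn, string $input, int $iterations = 100000): float
{
    $start = hrtime(true); // nanoseconds
    for ($i = 0; $i < $iterations; $i++) {
        $fn($input);
    }
    return (hrtime(true) - $start) / 1e6;
}

$descr = "Hello this is a test string";

// Both methods must agree before the timings mean anything.
var_export(wordsViaSlice($descr) === wordsViaUnset($descr)); // true
echo "\n";

printf("array_slice: %.2f ms\n", timeIt('wordsViaSlice', $descr));
printf("unset:       %.2f ms\n", timeIt('wordsViaUnset', $descr));
```

On a string this short the difference is likely to be tiny, so readability may reasonably decide between the two.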
Output from either of my methods:
array (
1 => 'hello',
2 => 'this',
3 => 'is',
4 => 'a',
5 => 'test',
6 => 'string',
)
Related / Near-duplicate:
strtolower()
on every iteration of $myarray
-- this is not optimal. Please have a look at my answer and perhaps provide one or two more sample strings in your question so that we can see any differences in string structure (capitalization and punctuation) because these points will impact the correct method for your task.