I've gone back to the drawing board with your pattern.
I personally never use named capture groups in PHP because they only bloat the pattern and the output array. If you need named keys, just assign them from the matches array.
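To illustrate the bloat, here is a minimal sketch using a hypothetical one-group pattern cut down from the KEY portion of the full pattern. The input string and the `$result` array are assumptions made for the example only.

```php
<?php
// Hypothetical input, for illustration only.
$input = 'CONF: FEDEX 12345';

// Named group: the same substring is stored under both 'KEY' and 1.
preg_match('~^(?<KEY>CONF|ESD|TRACKING)~i', $input, $named);

// Numbered group: one entry per capture, plus the fullstring match at 0.
preg_match('~^(CONF|ESD|TRACKING)~i', $input, $m);

// Assign the readable key only where it is needed.
$result = ['KEY' => $m[1]];

var_export([count($named), count($m), $result]);
```

The named version carries three elements for a single capture; the numbered version carries two, and the readable key is added only in the array you actually keep.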
When using pattern modifier i, you don't need to list upper and lower case letters in your character class.
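A small sketch of that point, using two hypothetical stand-alone patterns for the INITIALS segment. Under the `i` modifier, the shorter character class should accept exactly the same strings as the longer one.

```php
<?php
// Hypothetical patterns, for illustration only.
$verbose = '~^\*[.a-zA-Z]+$~i'; // A-Z is redundant once the i modifier is set
$concise = '~^\*[.a-z]+$~i';

$same = [];
foreach (['*sm', '*SM', '*J.d', 'sm'] as $sample) {
    // Record whether both patterns agree on each sample.
    $same[$sample] = preg_match($verbose, $sample) === preg_match($concise, $sample);
}
var_export($same);
```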
There is no benefit to your pattern by inserting \K to restart the fullstring match -- just omit those.
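As a rough demonstration with a hypothetical two-group pattern: `\K` changes only the fullstring match at index 0, while the capture groups come out the same either way.

```php
<?php
// Hypothetical input and patterns, for illustration only.
$input = 'CONF: FEDEX';

preg_match('~^(CONF)\h*:\h*\K(\S+)~i', $input, $withK);
preg_match('~^(CONF)\h*:\h*(\S+)~i', $input, $withoutK);

var_export([
    'fullstring with \K'    => $withK[0],    // only the text matched after \K
    'fullstring without \K' => $withoutK[0], // everything the pattern matched
    'same capture groups'   => array_slice($withK, 1) === array_slice($withoutK, 1),
]);
```

If the script never reads index 0, the two patterns are interchangeable and the shorter one is easier to read.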
Use non-capturing groups and the zero or one quantifier to make subsequent capture groups optional.
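Here is a simplified sketch of the nesting technique. The pattern is a hypothetical three-group reduction, not the full pattern: group 2 is optional, and group 3 can only match when group 2 did.

```php
<?php
// Hypothetical, simplified pattern for illustration only.
$pattern = '~^(CONF)(?:\h+(\S+)(?:\h+(\*[a-z]+))?)?$~i';

$rows = [];
foreach (['conf', 'conf abc', 'conf abc *sm'] as $input) {
    if (preg_match($pattern, $input, $m, PREG_UNMATCHED_AS_NULL)) {
        // ?? null guards against trailing unmatched groups being absent.
        $rows[$input] = [$m[1], $m[2] ?? null, $m[3] ?? null];
    }
}
var_export($rows);
```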
Instead of using .*? to lazily match the LINE data, match runs of non-whitespace characters delimited by one or more whitespace characters -- this will improve pattern performance by reducing the amount of backtracking that is necessary.
Instead of loosely validating a predictable LINE_DATA substring pattern with [\d\s,\(\)]+?, explicitly validate each delimited segment of that group. This improves the validation strength of your pattern.
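To make the difference concrete, this sketch compares a loose character class with the segment-by-segment subpattern, both anchored as hypothetical stand-alone patterns. The sample strings are invented for the example.

```php
<?php
// Hypothetical stand-alone patterns, for illustration only.
$loose  = '~^[\d\s,()]+$~';
$strict = '~^\d+(?:\(\d+\))?(?:\h*,\h*\d+(?:\(\d+\))?)*$~';

$report = [];
foreach (['12(2),2(9),32', '1, 2 ,3(4)', '1,,(2', '))(('] as $sample) {
    $report[$sample] = [
        'loose'  => (bool) preg_match($loose, $sample),
        'strict' => (bool) preg_match($strict, $sample),
    ];
}
var_export($report);
```

The loose class accepts any arrangement of its listed characters, so the two malformed samples pass it; the strict version accepts only well-formed, comma-separated segments.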
Admittedly, my linebreaks, subpattern tabbing, and inline comments are excessively long and wide -- certainly violating PSR-12 guidelines. This is a sacrifice that I am making to explain in great detail how the pattern works. Few development teams are composed entirely of regex gurus, so it is important that you aim to inform the weakest regex user who might read your script. I often include a link to a regex101.com demo with a battery of test cases in my professional projects because I want my team to be very sure about how the pattern works and how extensively it was tested.
Working Code (Demo) Regex101 Demo
$regex = <<<REGEX
~
^                                        # start of string anchor
(CONF|ESD|TRACKING)                      # capture group 1 KEY, one of three literal words
(?:                                      # start non-capturing group 1
    \h*[:;'\h]\h*                        # require a listed punctuation or space with optional leading or trailing spaces
    (\S+(?:\h+\S+)*?)                    # capture group 2 LINE, require one or more non-whitespace characters then lazily match zero or more repetitions of whitespace then non-whitespace substrings
    (?:                                  # start non-capturing group 2
        \h*L\h*[:;'\h]\h*                # require literal L then a listed punctuation or space with optional leading or trailing spaces
        (                                # start capture group 3 LINE_DATA
            (?:\d+(?:\(\d+\))?)          # require a number optionally followed by another number in parentheses
            (?:\h*,\h*\d+(?:\(\d+\))?)*  # match zero or more repetitions of the previous expression, each preceded by an optionally space-padded comma
        )                                # end capture group 3
    )?                                   # end non-capturing group 2 and make it optional
    (?:                                  # start non-capturing group 3
        \h*                              # match zero or more horizontal whitespaces
        (                                # start capture group 4 INITIALS
            \*[.a-z]+                    # match literal asterisk, then one or more dots or letters
        )                                # end capture group 4
    )?                                   # end non-capturing group 3 and make it optional
)?                                       # end non-capturing group 1 and make it optional
\h*                                      # allow trailing whitespaces
$                                        # end of string anchor
~ix
REGEX;
$tests = [
"esd hedf L:1,2,3 *sm ",
"CONF: FEDEX 12345 L: 12(2),2(9),32 *SM",
"Tracking *cool",
"ESD: 12/12/92 L: ",
"tRacking' my data L: 1,2,3(4) ",
"conf something *asterisk",
"tracking",
"ConF''' something '' L: 6",
"esd test 24(7)",
];
foreach ($tests as $i => $test) {
    if (preg_match($regex, $test, $m, PREG_UNMATCHED_AS_NULL)) {
        var_export([
            "test index" => $i,
            "KEY" => $m[1],
            "LINE" => $m[2] ?? null,
            "LINE_DATA" => $m[3] ?? null,
            "INITIALS" => $m[4] ?? null
        ]);
        echo "\n";
    }
}
Output:
array (
'test index' => 0,
'KEY' => 'esd',
'LINE' => 'hedf',
'LINE_DATA' => '1,2,3',
'INITIALS' => '*sm',
)
array (
'test index' => 1,
'KEY' => 'CONF',
'LINE' => 'FEDEX 12345',
'LINE_DATA' => '12(2),2(9),32',
'INITIALS' => '*SM',
)
array (
'test index' => 2,
'KEY' => 'Tracking',
'LINE' => '*cool',
'LINE_DATA' => NULL,
'INITIALS' => NULL,
)
array (
'test index' => 3,
'KEY' => 'ESD',
'LINE' => '12/12/92 L:',
'LINE_DATA' => NULL,
'INITIALS' => NULL,
)
array (
'test index' => 4,
'KEY' => 'tRacking',
'LINE' => 'my data',
'LINE_DATA' => '1,2,3(4)',
'INITIALS' => NULL,
)
array (
'test index' => 5,
'KEY' => 'conf',
'LINE' => 'something',
'LINE_DATA' => NULL,
'INITIALS' => '*asterisk',
)
array (
'test index' => 6,
'KEY' => 'tracking',
'LINE' => NULL,
'LINE_DATA' => NULL,
'INITIALS' => NULL,
)
array (
'test index' => 7,
'KEY' => 'ConF',
'LINE' => '\'\' something \'\'',
'LINE_DATA' => '6',
'INITIALS' => NULL,
)
array (
'test index' => 8,
'KEY' => 'esd',
'LINE' => 'test 24(7)',
'LINE_DATA' => NULL,
'INITIALS' => NULL,
)
\K restarting the fullstring match. If you don't have any need for the fullstring match, then don't access the [0] element in the generated array of matches. You only need to use \K if you want to "forget/release" previously matched characters -- which you have no need for here.
[$1, ?[$2, ?[$3, ?[$4]]] or ?$1, ?$2, ?$3, ?$4? In other words, can the 3rd capture group be satisfied without the 2nd? Can a valid/qualifying string contain only $1 and $4? I'm trying to determine if I have maintained your pattern logic with regex101.com/r/Q8BvaP/1
\K. Makes sense. I put it in because it made the result 'cleaner' on the regex101 page. But it won't make much difference in actual code. So -- $3 and $4 can only exist if $2 does. We can have the following permutations: $1 $2; $1 $2 $3; $1 $2 $3 $4; $1 $2 $4