0

I want to let the user type in tags: windows linux "mac os x"

and then split them up by whitespace, while still recognizing "mac os x" as a single tag.

Is it possible to combine the explode function with other functions to do this?

2
  • Tell the user to use mac-os-x :) Commented Dec 19, 2009 at 10:23
  • I would create a dictionary file of tags and use the levenshtein function to find the best match. Commented Dec 20, 2009 at 4:46

6 Answers

8

I would ask the user to enter the tags comma-separated and explode on the comma delimiter:

$string = "windows, linux, mac os x";
// trim() removes the space left after each comma
$pieces = array_map('trim', explode(',', $string));

This is the way most tag systems work anyway.

Otherwise you'll need to construct a parser, because explode alone cannot cope with what you want. Regex is overkill in my opinion.
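
For illustration, a minimal sketch of what such a hand-written parser could look like. The function name parse_tags is hypothetical, and the sketch assumes that double quotes are only ever used to group words (no escaped quotes inside a tag):

```php
<?php
// Hypothetical helper: split on whitespace, treating "quoted text" as one tag.
function parse_tags(string $input): array
{
    $tags = [];
    $current = '';
    $inQuotes = false;

    foreach (str_split($input) as $char) {
        if ($char === '"') {
            // A quote toggles quoted mode and is not part of the tag itself.
            $inQuotes = !$inQuotes;
        } elseif (!$inQuotes && ctype_space($char)) {
            // Unquoted whitespace ends the current tag, if there is one.
            if ($current !== '') {
                $tags[] = $current;
                $current = '';
            }
        } else {
            $current .= $char;
        }
    }

    // Flush the last tag (an unclosed quote is simply treated as closed here).
    if ($current !== '') {
        $tags[] = $current;
    }

    return $tags;
}

var_export(parse_tags('windows linux "mac os x"'));
```

For the input above this should yield the three tags windows, linux and mac os x.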


2

As long as there can't be quotes within quotes (e.g. "foo\"bar" isn't allowed), you can do this with a regular expression. Otherwise you need a full parser.

This should do:

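function split_words($input) {
  $matches = array();
  // Match either a double-quoted phrase (group 2) or a bare word (group 3).
  if (preg_match_all('/("([^"]+)")|(\w+)/', $input, $reg)) {
    for ($ii = 0, $cc = count($reg[0]); $ii < $cc; ++$ii) {
      // Compare against '' so that a quoted tag such as "0" is not treated as empty.
      $matches[] = $reg[2][$ii] !== '' ? $reg[2][$ii] : $reg[3][$ii];
    }
  }
  return $matches;
}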

Usage:

$input = 'windows linux "mac os x"';
var_dump(split_words($input));


2

Either have the user separate their tag values with commas as Elzo Valugi suggested, or improve your UI so that users enter one tag at a time (similar to Google Wave's or WordPress's tagging UI). I suggest the latter.

If you really want to stick with your proposed entry format (which I don't suggest), you could maintain a list of multi-word tags (those that aren't supposed to be split). Compare the combined tag string provided by the user against this list and make sure that you don't split those terms. If you're set on sticking to this method, I could go into the details more, but I don't think it's a good idea as the entry format itself is flawed.
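
A rough sketch of the known-tags idea, under stated assumptions: the function name split_with_known_tags and the $knownTags list are hypothetical, and matching is done on plain substrings, so a known tag that happens to appear inside a longer word would also be protected.

```php
<?php
// Hypothetical helper: $knownTags is an assumed list of multi-word tags.
function split_with_known_tags(string $input, array $knownTags): array
{
    // Match longer phrases first so "mac os x" wins over "mac os".
    usort($knownTags, fn($a, $b) => strlen($b) <=> strlen($a));

    // Temporarily join the words of each known tag with a placeholder character.
    $placeholder = "\x1F"; // unit separator, unlikely to be typed by a user
    foreach ($knownTags as $tag) {
        $protected = str_replace(' ', $placeholder, $tag);
        $input = str_ireplace($tag, $protected, $input);
    }

    // Split on whitespace, then restore the spaces inside protected tags.
    $parts = preg_split('/\s+/', $input, -1, PREG_SPLIT_NO_EMPTY);
    return array_map(fn($p) => str_replace($placeholder, ' ', $p), $parts);
}

var_export(split_with_known_tags('windows linux mac os x', ['mac os x']));
```

The obvious weakness, which is part of why this entry format is hard to recommend, is that any multi-word tag missing from the list is still split into separate words.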

2 Comments

An underscore _ could also be used in tags in place of spaces, followed by a simple str_replace to turn it back into a space.
That's not really a legitimate request to make of a user.
0

You could do a regex. I'm not the best at writing them, but someone else here should be able to match the 'words' breaking them on spaces that aren't in quotes.
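
One possible way to write such a regex, as a sketch only: split on whitespace that is followed by an even number of double quotes, which means the whitespace is outside any quoted phrase. The function name split_outside_quotes is hypothetical, and the approach assumes the quotes in the input are balanced.

```php
<?php
// Hypothetical helper name; splits on whitespace that is outside double quotes.
function split_outside_quotes(string $input): array
{
    // The lookahead only allows a split when an even number of quotes follows,
    // which means the whitespace is not inside a quoted phrase.
    $pattern = '/\s+(?=(?:[^"]*"[^"]*")*[^"]*$)/';
    $parts = preg_split($pattern, trim($input), -1, PREG_SPLIT_NO_EMPTY);

    // Drop the surrounding quotes from quoted phrases.
    return array_map(fn($part) => trim($part, '"'), $parts);
}

var_export(split_outside_quotes('windows linux "mac os x"'));
```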


0

When the user enters the string "mac os x", you can automatically detect the whitespace and change the string to "mac-os-x". Then you can still explode it this way:

$os = "metasys solaris mac-os-x";
$strings = explode(' ', $os);

You can do the replacement with the str_replace function.
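
A small sketch of that replacement step. It assumes each tag arrives on its own (for example from a one-tag-at-a-time input), since a single space-separated string gives no way to tell which spaces belong inside a tag; the variable names are illustrative.

```php
<?php
// Assumes each tag arrives separately, e.g. from a one-tag-at-a-time input.
$tags = ['metasys', 'solaris', 'mac os x'];

// Replace inner spaces with hyphens so a space can act as the delimiter.
$hyphenated = array_map(fn($tag) => str_replace(' ', '-', trim($tag)), $tags);

$os = implode(' ', $hyphenated);   // "metasys solaris mac-os-x"
$strings = explode(' ', $os);

var_export($strings);
```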

2 Comments

This implies that the user is entering tags one at a time, in which case it's possible to keep the tags separate from the beginning. Also, making the user convert spaces to "-" isn't very usable.
Even if the user enters the tags one at a time, you aren't going to send each one to the server immediately. And by the way, I am talking about entering a tag and just hitting the Enter key. It's user-friendly, and it is equivalent to putting spaces in between: users just have to hit ENTER in my case and SPACE BAR in the other.
0

You are parsing a delimited string -- that delimiter in this case is a space.

PHP has str_getcsv() which will protect substrings wrapped in a particular character -- the default wrapping character is a double quote (how convenient for you). If your input string was comma-delimited, you could omit the 2nd parameter because that is the default value.

The double quotes will be stripped from the value in the result array.

Code:

$string = 'windows linux "mac os x"';

var_export(
    str_getcsv($string, ' ')
);

Output:

array (
  0 => 'windows',
  1 => 'linux',
  2 => 'mac os x',
)

