I know this question has been asked many times, but I couldn't find the right answer.

If I have a string of URLs such as:

"www.google.comwww.yahoo.comwww.ebay.com" (assuming no spaces between links)

I want to extract each link individually and put them in an array. I tried to use a regex like:

    String[] sp= parts.split("\\www");
    System.out.println(parts[0]);

This didn't work! Any hint is appreciated.

  • This is not generically possible, since that entire string is a technically valid domain name. Commented Apr 28, 2014 at 17:10
  • Is parts an array? You are storing the result of the split in an array called sp, yet you are printing from parts. Commented Apr 28, 2014 at 17:18

2 Answers

Regex

(www\.((?!www\.).)*)


Description

Options: case insensitive

Match the regular expression below and capture its match into backreference number 1 «(www\.((?!www\.).)*)»
    Match the characters “www” literally «www»
    Match the character “.” literally «\.»
    Match the regular expression below and capture its match into backreference number 2 «((?!www\.).)*»
        Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
        Note: You repeated the capturing group itself.  The group will capture only the last iteration.  Put a capturing group around the repeated group to capture all iterations. «*»
        Assert that it is impossible to match the regex below starting at this position (negative lookahead) «(?!www\.)»
            Match the characters “www” literally «www»
            Match the character “.” literally «\.»
        Match any single character that is not a line break character «.»

Java

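// Requires: java.util.ArrayList, java.util.List, java.util.regex.Matcher, java.util.regex.Pattern
String subjectString = "www.google.comwww.yahoo.comwww.ebay.com";
List<String> urls = new ArrayList<>();
// The pattern matches the URLs themselves, so collect the matches with find().
// Calling split() with it would treat every URL as a delimiter and return an empty array.
Matcher matcher = Pattern.compile("(?i)(www\\.((?!www\\.).)*)").matcher(subjectString);
while (matcher.find()) {
    urls.add(matcher.group(1));
}
String[] urlArray = urls.toArray(new String[0]); // [www.google.com, www.yahoo.com, www.ebay.com]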
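
If an array from split() is what you want, one option is to split at the zero-width position just before each "www." with a lookahead, so that no characters are consumed. This is only a minimal sketch, assuming Java 8 or later (where a zero-width match at the start of the input does not produce a leading empty string); the class and method names are hypothetical.

```java
import java.util.Arrays;

public class LookaheadSplit {
    // Hypothetical helper: splits just before every "www." (case-insensitive).
    static String[] splitUrls(String input) {
        // (?=www\.) is a zero-width lookahead, so the "www." text itself is kept.
        return input.split("(?i)(?=www\\.)");
    }

    public static void main(String[] args) {
        String[] urls = splitUrls("www.google.comwww.yahoo.comwww.ebay.com");
        // On Java 8+: [www.google.com, www.yahoo.com, www.ebay.com]
        System.out.println(Arrays.toString(urls));
    }
}
```

Like the regex above, this still cannot tell a genuine "www." inside a host name from the start of the next URL.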

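You can also use basic string methods to turn each ".comwww." into ".com www." and then split on the spaces. Note that this only works when every URL ends in .com:

    // Requires: import java.util.Arrays;
    String urlString = "www.google.comwww.yahoo.comwww.ebay.com";
    // replace() matches the text literally; replaceAll() would treat each "." as a regex wildcard
    String[] urlArray = urlString.replace(".comwww.", ".com www.").split(" ");
    System.out.println(Arrays.toString(urlArray)); // [www.google.com, www.yahoo.com, www.ebay.com]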
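
One caveat for this kind of substitution: String.replaceAll interprets its first argument as a regular expression, so an unescaped "." matches any character, whereas String.replace matches the text literally. Here is a small sketch of the difference, using a made-up input and hypothetical class and method names:

```java
public class LiteralVsRegexReplace {
    // Regex-based: "." in the pattern matches any single character.
    static String viaRegex(String input) {
        return input.replaceAll(".comwww.", ".com www.");
    }

    // Literal: only the exact text ".comwww." is replaced.
    static String viaLiteral(String input) {
        return input.replace(".comwww.", ".com www.");
    }

    public static void main(String[] args) {
        // Made-up input: "ecomwww." happens to fit the regex ".comwww."
        String input = "www.telecomwww.example.org";
        System.out.println(viaRegex(input));   // www.tel.com www.example.org
        System.out.println(viaLiteral(input)); // www.telecomwww.example.org
    }
}
```

For the sample string in the question both calls give the same result; the difference only shows up on inputs where some other character sits where the dots are expected.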
