1

I have a list of URLs like these:

  • http://www.example.com/pk/ca,
  • http://www.example.com/pk,
  • http://www.example.com/anthingcangoeshere/pk, and
  • http://www.example.com/pkisnotnecessaryhere.

Now I want to find only those URLs that end with /pk or /pk/ and don't have anything between .com and /pk.

  • Your question isn't very clear. Give many more examples of what you do want to match and what you don't want to match. Commented Apr 18, 2010 at 10:13
  • It's still not clear. Does the URL have to contain .com? Commented Apr 18, 2010 at 10:15
  • @Mark yes, it should contain .com Commented Apr 18, 2010 at 10:17
  • Maybe you mean “URLs whose path is /pk or starts with /pk/” or “URLs whose first path segment is pk”? Commented Apr 18, 2010 at 10:17
  • This is a very useful page to learn regex: zytrax.com/tech/web/regex.htm Commented Apr 18, 2010 at 10:54

4 Answers

1

Your problem isn't fully defined, so I can't give you an exact answer, but this should be a starting point:

^[^:]+://[^/]+\.com/pk/?$

These strings will match:

http://www.example.com/pk
http://www.example.com/pk/
https://www.example.com/pk

These strings won't match:

http://www.example.co.uk/pk
http://www.example.com/pk/ca
http://www.example.com/anthingcangoeshere/pk
http://www.example.com/pkisnotnecessaryhere
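
As a rough check, the pattern above can be run against the listed strings in Java. This is only a sketch: the class name PkUrlFilter and the helper isPkUrl are made up for this example, and the backslash is doubled because the regex sits inside a Java string literal.

```java
import java.util.regex.Pattern;

public class PkUrlFilter {
    // Pattern from the answer: any scheme, a host ending in .com, then exactly /pk or /pk/.
    private static final Pattern PK_URL = Pattern.compile("^[^:]+://[^/]+\\.com/pk/?$");

    // Hypothetical helper, introduced only for this example.
    static boolean isPkUrl(String url) {
        return PK_URL.matcher(url).matches();
    }

    public static void main(String[] args) {
        String[] urls = {
            "http://www.example.com/pk",
            "http://www.example.com/pk/",
            "https://www.example.com/pk",
            "http://www.example.co.uk/pk",
            "http://www.example.com/pk/ca",
            "http://www.example.com/anthingcangoeshere/pk",
            "http://www.example.com/pkisnotnecessaryhere"
        };
        // Prints each URL followed by true or false.
        for (String url : urls) {
            System.out.println(url + " -> " + isPkUrl(url));
        }
    }
}
```

The first three URLs should print true and the remaining four false, matching the two lists above.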

1
String pattern = "^http://www\\.example\\.com/pk/?$";

Hope this helps.

Some details: if you don't add ^ to the beginning of the pattern, then foobarhttp://www.example.com/pk/ will be accepted too. If you don't add $ to the end of the pattern, then http://www.example.com/pk/foobar will be accepted too.
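
The effect of the anchors can be seen with Matcher.find(), which searches anywhere in the input. The sketch below is illustrative only: the class name AnchorDemo and the helper found are invented for this example, and the dots are escaped so that they match a literal '.'. Note that Matcher.matches() always requires the whole input to match, so the anchors only make a difference with find().

```java
import java.util.regex.Pattern;

public class AnchorDemo {
    // Anchored pattern: the whole input must be the URL, with an optional trailing slash.
    static final Pattern ANCHORED = Pattern.compile("^http://www\\.example\\.com/pk/?$");
    // Same pattern without ^ and $, to show what the anchors prevent.
    static final Pattern UNANCHORED = Pattern.compile("http://www\\.example\\.com/pk/?");

    // Hypothetical helper: find() looks for the pattern anywhere in the input.
    static boolean found(Pattern pattern, String input) {
        return pattern.matcher(input).find();
    }

    public static void main(String[] args) {
        String[] inputs = {
            "http://www.example.com/pk/",
            "foobarhttp://www.example.com/pk/",
            "http://www.example.com/pk/foobar"
        };
        for (String input : inputs) {
            System.out.println(input
                + " anchored=" + found(ANCHORED, input)
                + " unanchored=" + found(UNANCHORED, input));
        }
    }
}
```

Only the first input is accepted by the anchored pattern, while the unanchored one accepts all three.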

1

Directly translating your request "[...] URLs that ends with /pk or /pk/ and don't have anything in between .com and /pk", with the additional assumption that there is always a ".com", yields one of the following regexes, depending on which Matcher method you use:

If you use find():

\.com/pk/?$

If you use matches():

.*\.com/pk/?

Other answers given here use more restrictive patterns that only allow URLs closer to your examples. In particular, my pattern does not validate that the given string is a syntactically valid URL.
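
A minimal sketch of how the two variants could be used with java.util.regex follows. The class name FindVersusMatches and the helpers viaFind and viaMatches are assumptions made for this example; find() searches within the input, while matches() has to consume all of it, which is why the second pattern needs the leading .* and no $.

```java
import java.util.regex.Pattern;

public class FindVersusMatches {
    // For find(): only the tail of the string is described, anchored with $.
    static final Pattern FOR_FIND = Pattern.compile("\\.com/pk/?$");
    // For matches(): the whole string must match, so .* absorbs everything before ".com".
    static final Pattern FOR_MATCHES = Pattern.compile(".*\\.com/pk/?");

    // Hypothetical helpers, introduced only for this example.
    static boolean viaFind(String url) {
        return FOR_FIND.matcher(url).find();
    }

    static boolean viaMatches(String url) {
        return FOR_MATCHES.matcher(url).matches();
    }

    public static void main(String[] args) {
        String[] urls = {
            "http://www.example.com/pk",
            "http://www.example.com/pk/",
            "http://www.example.com/pk/ca",
            "http://www.example.com/anthingcangoeshere/pk",
            "http://www.example.com/pkisnotnecessaryhere",
            "not a url.com/pk" // accepted too: the pattern does not validate URL syntax
        };
        for (String url : urls) {
            System.out.println(url + " find=" + viaFind(url) + " matches=" + viaMatches(url));
        }
    }
}
```

Both helpers should agree on every input here. Using the find() pattern with matches() would reject even valid inputs, because matches() would then require the entire string to be just the ".com/pk" tail.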

0
String pattern = "^https?://(www\\.)?.+\\.com/pk/?$";
