I have the following function, which converts a string into a slug for an SEO-friendly URL.

stringToSlug: function (title) {
   return title.toLowerCase().trim()
       .replace(/\s+/g, '-')           // Replace spaces with -
       .replace(/&/g, '-and-')         // Replace & with 'and'
       .replace(/[^\w\-]+/g, '')       // Remove all non-word chars
       .replace(/\-\-+/g, '-')         // Replace multiple - with single -
    }

var title1 = 'Maoist Centre adamant on PM or party chair’s post';
function stringToSlug1 (title) {
  return title.toLowerCase().trim()
    .replace(/\s+/g, '-')           // Replace spaces with -
    .replace(/&/g, '-and-')         // Replace & with 'and'
    .replace(/[^\w\-]+/g, '')       // Remove all non-word chars
    .replace(/\-\-+/g, '-')         // Replace multiple - with single -
 }
console.log(stringToSlug1(title1));

var title2 = 'घर-घरमा ग्यास पाइपः कार्यान्वयनको जिम्मा ओलीकै काँधमा !';

function stringToSlug2 (title) {
  return title.toLowerCase().trim()
    .replace(/\s+/g, '-')           // Replace spaces with -
    .replace(/&/g, '-and-')         // Replace & with 'and'
    .replace(/[^\w\-]+/g, '')       // Remove all non-word chars
    .replace(/\-\-+/g, '-')         // Replace multiple - with single -
 }
console.log(stringToSlug2(title2));

I have run the function above on titles in two languages: stringToSlug1 on English text and stringToSlug2 on Nepali text. With English text the function works fine, but with the Nepali text it returns only -. The result I want from stringToSlug2 is घर-घरमा-ग्यास-पाइप-कार्यान्वयनको-जिम्मा-ओलीकै-काँधमा

  • Try [\p{L}\p{Digit}_] instead of \w, \w only matches ASCII (unfortunately). Commented Dec 28, 2017 at 5:21
  • I want to replace special character such as :, !, #@$$#@^%#^ Commented Dec 28, 2017 at 5:26
  • Replace .replace(/[^\w\-]+/g, '') with .replace(/[^\p{L}\p{Digit}_\-]+/g, '') Commented Dec 28, 2017 at 5:28
  • Hi @Majora320 I got it-t-t-p-pt-i-pt from function stringToSlug1 and ` -` from stringToSlug2 when using the regex you have mentioned. Commented Dec 28, 2017 at 5:32
  • Hmm, JS regexes must not support that. Give me a few mins. Commented Dec 28, 2017 at 5:48

2 Answers

Unfortunately, the designers of regular expressions (the ones in JavaScript, anyway) did not think much about internationalization when designing them. \w only matches a-z, A-Z, 0-9, and _, so [^\w\-]+ means [^a-zA-Z0-9_\-]+ and strips every Devanagari character. Other dialects of regular expressions have a Unicode-enabled word pattern, but your best bet for JavaScript is to have a blacklist of symbols (you mentioned :!#@$$#@^%#^). You can do that with something like [:!#@$^%]+ (instead of [^\w\-]+).
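For readers on newer engines, the Unicode-class idea from the comments can be sketched as follows. This assumes a runtime that supports ES2018 Unicode property escapes (\p{...} together with the u flag), which was not generally available when the comments above were written. The name slugifyUnicode is hypothetical, and the sketch is one possible approach rather than a definitive implementation.

```javascript
// Sketch only: assumes ES2018 Unicode property escapes (the `u` flag).
// slugifyUnicode is a hypothetical name, not part of the original question.
function slugifyUnicode(title) {
  return title
    .toLowerCase()
    .trim()
    .replace(/\s+/g, '-')                       // whitespace runs -> single hyphen
    .replace(/&/g, '-and-')                     // spell out ampersands
    .replace(/[^\p{L}\p{M}\p{N}_\-]+/gu, '')    // keep letters, combining marks, digits, _ and -
    .replace(/-{2,}/g, '-')                     // collapse repeated hyphens
    .replace(/^-+|-+$/g, '');                   // trim hyphens at both ends
}

console.log(/\w/.test('घ'));      // false: \w is ASCII-only
console.log(/\p{L}/u.test('घ'));  // true: \p{L} matches letters in any script
console.log(slugifyUnicode('घर-घरमा ग्यास पाइपः कार्यान्वयनको जिम्मा ओलीकै काँधमा !'));
```

\p{M} is included because Devanagari vowel signs and the virama are combining marks, not letters; without it the words would be broken apart. As far as I can tell the visarga (ः) also falls in that category, so it survives in the slug; if it should be dropped, as in the desired output in the question, it would need an explicit extra replace.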

Based on an answer: https://stackoverflow.com/a/18936783/5740382.

I have come up with a solution, though I suspect it is not a good one. Instead of removing all non-word characters with .replace(/[^\w\-]+/g, ''), I replace a fixed set of special characters with .replace(/([~!@#$%^&*()_+=`{}\[\]\|\\:;'<>,.\/? ])+/g, '-'). Here is my function.

var title = 'घर-घरमा ग्यास पाइपः कार्यान्वयनको जिम्मा ओलीकै काँधमा !';

function stringToSlug (title) {
  return title.toLowerCase().trim()
  .replace(/\s+/g, '-')           // Replace spaces with -
  .replace(/&/g, '-and-')         // Replace & with '-and-'
  .replace(/([~!@#$%^&*()_+=`{}\[\]\|\\:;'<>,.\/? ])+/g, '-') // Replace special characters with -
  .replace(/\-\-+/g, '-')         // Replace multiple - with single -
  .replace(/^-+|-+$/g, '')        // Trim leading and trailing -
}
console.log(stringToSlug(title));
