Replacing characters between strings with html tags in react

Question

I'm trying to parse some text so that _this is emphasized!_ is wrapped in <em> tags like so: <em>this is emphasized!</em>.

My component currently looks like this:

export default class TextParser extends React.Component {
  render() {
    let text = this.props.text,
        parsed, regex, paragraphs;

    regex = {
      paragraph: /(?:\r\n){2,}/g,
      emphasize: /\_(.*?)\_/g,
      strong: /\*(.*?)\*/g,
    }

    // Apply regex
    text = text.replace(regex.emphasize, (str) => {
      let parsed = str.substr(1, str.length - 1);

      return ('<em>' + parsed + '</em>')
    })

    paragraphs = text.split(regex.paragraph) || []
    paragraphs = paragraphs.map((text, i) => {
      return (
        <p key={i}>
          {text}
        </p>
      )
    })

    return (
      <div className="document">{paragraphs}</div>
    )
  }
}

This does not work, however: the output HTML displays the tags as plain text instead of rendering them as elements. This is, of course, because React escapes strings before rendering them.

I could use dangerouslySetInnerHTML, but I want to avoid that. How can I replace the underscores around the text with <em> tags?

MatthewG · Accepted Answer · 2016-10-21 21:19:11Z

As you noticed, placing the string "<em>" in the result of replace just adds that string and not an actual tag.

You will not be able to create tags directly inside of replace, because it operates on a string and returns a string.

Instead, break the string up into separate elements and add the tags where you need them. You already do something like this in the paragraph case.

Because the paragraph case also operates on a string, these kinds of operations have to be nested: once you complete one of them, you no longer have a plain text string but an array of React elements. So in this example I moved the <em> parsing inside the paragraph parsing.

One last note: I had to modify the emphasize regex so that it captures the underscores. With a capture group, split keeps the matched text in the resulting array, and I need the underscores there to check again whether each piece was a match after I have done the split.

let text = this.props.text,
    regex, paragraphs;

regex = {
  paragraph: /(?:\r\n){2,}/g,
  // The capture group keeps each "_..._" piece in the array returned by split
  emphasize: /(\_.*?\_)/g,
  strong: /\*(.*?)\*/g,
}

paragraphs = text.split(regex.paragraph)
paragraphs = paragraphs.map((text, i) => {
  return (
    <p key={i}>
      {
        // Split on the emphasize pattern, then wrap the matching pieces in <em>
        text.split(regex.emphasize).map((str, j) => {
          return str.search(regex.emphasize) !== -1
            // Strip the leading and trailing underscore; key avoids React's list warning
            ? (<em key={j}>{str.slice(1, -1)}</em>)
            : str;
        })
      }
    </p>
  )
})

return (
  <div className="document">{paragraphs}</div>
)
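
To see why the capture group matters, here is a small standalone sketch in plain JavaScript (no React). The helper name splitEmphasis, the node shape, and the sample strings are made up for illustration and are not part of the component above; the sketch only shows what split returns and how each piece can be classified afterwards.

```javascript
// Hypothetical illustration, not part of the component above.
// Without a capture group, split discards the matched text.
const withoutGroup = 'a _b_ c'.split(/_.*?_/);
// With a capture group, the matched text is kept as its own array entry.
const withGroup = 'a _b_ c'.split(/(_.*?_)/);

function splitEmphasis(text) {
  return text.split(/(_.*?_)/).map((piece) => {
    // Anchored test: the whole piece must look like "_..._"
    return /^_.*_$/.test(piece)
      ? { type: 'em', text: piece.slice(1, -1) }
      : { type: 'text', text: piece };
  });
}

console.log(withoutGroup);
console.log(withGroup);
console.log(splitEmphasis('a _b_ c'));
```

In the component, the 'em' pieces are the ones that become <em> elements and the 'text' pieces stay plain strings.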

Based on your comments below, you also want to know how to handle the case where both kinds of formatting can appear. For completeness, I have included the code for that here. I chose to combine the formatting patterns into a single regex, and then I explicitly check for '_' or '*' to decide whether to add em or b tags. When there is a match, I call the function recursively on the matched text, in case there are additional matches within it. You may choose to clean this up differently, but I hope this helps.

let text = this.props.text,
    regex, paragraphs;

regex = {
  paragraph: /(?:\r\n){2,}/g,
  formatting: /(\_.*?\_)|(\*.*?\*)/g,
}

// Split on either pattern, wrap each match in its tag, and recurse into
// the matched text so that nested formatting is handled too
let applyFormatting = (text) => {
  // filter drops the empty strings and the undefined entries that split
  // produces for whichever capture group did not match
  return text.split(regex.formatting).filter(n => n).map((str, i) => {
    return str[0] === '_'
      ? (<em key={i}>{applyFormatting(str.slice(1, -1))}</em>)
      : str[0] === '*'
      ? (<b key={i}>{applyFormatting(str.slice(1, -1))}</b>)
      : str;
  });
};

paragraphs = text.split(regex.paragraph)
paragraphs = paragraphs.map((text, i) => {
  return (
    <p key={i}>
      { applyFormatting(text) }
    </p>
  )
})

return (
  <div className="document">{paragraphs}</div>
)
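
The recursion can also be tried outside React with a plain-JavaScript sketch that builds a small tree of objects instead of JSX elements. The names parseInline, FORMATTING, and TAGS and the node shape are assumptions made for this illustration; only the regex is taken from the answer. Unlike the code above, this sketch checks both ends of a piece, so a lone unclosed marker is left as text.

```javascript
// Hypothetical sketch: same pattern as the answer, but it returns plain
// objects so that it can run without React.
const FORMATTING = /(_.*?_)|(\*.*?\*)/;
const TAGS = { '_': 'em', '*': 'b' };

function parseInline(text) {
  return text
    .split(FORMATTING)
    // Drop empty strings and the undefined slot of the group that did not match
    .filter((piece) => piece)
    .map((piece) => {
      const marker = piece[0];
      // A formatted piece starts and ends with the same known marker
      const isMatch =
        piece.length > 1 &&
        TAGS[marker] !== undefined &&
        piece[piece.length - 1] === marker;
      return isMatch
        ? { tag: TAGS[marker], children: parseInline(piece.slice(1, -1)) }
        : piece;
    });
}

console.log(JSON.stringify(parseInline('*bold _and italic_ text*'), null, 2));
```

Because the lazy pattern matches from the first marker it finds, the outermost pair is consumed first and the recursion then picks up whatever is nested inside it. Turning each node of this tree into an element would be the React-specific step.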

Comments

Great answer! Is there a DRY way to make it parse the * characters and turn them into bold tags as well?
This might be a bit trickier if it is possible to nest them. If not, then it is probably not too tough. Can we ignore the possibility of Italicized *and bold* text or do you need to support that type of nested emphasize/strong?
I need both, and possibly more in the future. This is so users can have rich text in their content.
I see, so you cannot just split once and then apply the tags. Each time you find a match, you will need to apply the tags and then continue to search for the other matches. Since they can be in any order (e.g. bold _and italic_ text), this will have to be done so that it finds the outermost tags first and then continues to search the resulting substrings.
This sounds incredibly complicated. Regex is such a braintwister for me. Can you help me?
