Native JavaScript or ES6 way to encode and decode HTML entities?

Question

Is there a native way to encode or decode HTML entities using JavaScript or ES6? For example, < would be encoded as &lt;. There are libraries like html-entities for Node.js, but it feels like there should be something built into JavaScript that already handles this common need.

There is not a native JavaScript facility. JavaScript the programming language does not really have much to do with HTML, goofy APIs on the String prototype notwithstanding.
– Pointy, Commented Oct 26, 2016 at 13:39
@Pointy I think generally speaking you're right. It just feels like since JavaScript is so widely used on the web, and HTML entities are a common feature of web development, something like this would've made its way into the language over the past decade.
– Marty Chang, Commented Oct 26, 2016 at 14:21
I think the question would benefit from clearly including in its scope whether such a function exists in browsers and the Node.js standard library.
– hippietrail, Commented Nov 24, 2017 at 4:19

asafel · Accepted Answer · 2020-04-27 15:58:28Z

34

A nice ES6 function for escaping HTML:

const escapeHTML = str => str.replace(/[&<>'"]/g, 
  tag => ({
      '&': '&amp;',
      '<': '&lt;',
      '>': '&gt;',
      "'": '&#39;',
      '"': '&quot;'
    }[tag]));

edited Apr 27, 2020 at 15:58

answered Aug 11, 2019 at 9:34

asafel


1 Comment

Thomas Urban Over a year ago

+1 for being nice and simple and for working without depending on some 3rd-party code ... though the fallback is unnecessary due to the regexp

KyleMit · Accepted Answer · 2019-08-29 03:03:23Z

Roll Your Own (caveat: use HE instead for most use cases)

Without a library, you can encode and decode HTML entities in plain JavaScript like this:

let encode = str => {
  let buf = [];

  for (var i = str.length - 1; i >= 0; i--) {
    buf.unshift(['&#', str[i].charCodeAt(), ';'].join(''));
  }

  return buf.join('');
}

let decode = str => {
  return str.replace(/&#(\d+);/g, function(match, dec) {
    return String.fromCharCode(dec);
  });
}

Usages:

encode("Hello > © <") // "&#72;&#101;&#108;&#108;&#111;&#32;&#62;&#32;&#169;&#32;&#60;"
decode("Hello &gt; &copy; &#169; &lt;") // "Hello &gt; &copy; © &lt;"

However, you can see this approach has a couple shortcomings:

It encodes even safe characters, e.g. H → &#72;
It can decode numeric codes (not in the astral plane), but it knows nothing about the full list of HTML entities / named character references supported by browsers, like &gt;
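
Both shortcomings can be narrowed with a slightly longer sketch. This is only a minimal illustration, not a replacement for a full entity table. The names `encodeEntities` and `decodeNumericEntities` are hypothetical. The encoder escapes only the five HTML-significant characters plus non-ASCII characters, and the decoder still leaves named references such as `&copy;` untouched.

```javascript
// Hypothetical helper: encode only characters that need it, by code point.
const encodeEntities = str =>
  Array.from(str, ch => {
    const cp = ch.codePointAt(0);
    // ASCII other than & < > " ' passes through unchanged.
    return cp < 128 && !/[&<>"']/.test(ch) ? ch : `&#${cp};`;
  }).join('');

// Hypothetical helper: decode decimal (&#169;) and hex (&#xA9;) references.
const decodeNumericEntities = str =>
  str.replace(/&#(?:x([0-9a-f]+)|(\d+));/gi, (match, hex, dec) => {
    const cp = hex !== undefined ? parseInt(hex, 16) : parseInt(dec, 10);
    // Leave out-of-range references as-is rather than throwing a RangeError.
    return cp <= 0x10ffff ? String.fromCodePoint(cp) : match;
  });

console.log(encodeEntities('Hello > © < 𝌆'));
// "Hello &#62; &#169; &#60; &#119558;"
console.log(decodeNumericEntities('&#169; &#xA9; &#x1D306; &copy;'));
// "© © 𝌆 &copy;"
```

Iterating with `Array.from` walks the string by code point rather than by UTF-16 unit, which is why the astral character becomes a single reference.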

Use the HE Library (Html Entities)

Support for all standardized named character references
Support for unicode
Works with ambiguous ampersands
Written by Mathias Bynens

Usage:

he.encode('foo © bar ≠ baz 𝌆 qux'); 
// Output : 'foo &#xA9; bar &#x2260; baz &#x1D306; qux'

he.decode('foo &copy; bar &ne; baz &#x1D306; qux');
// Output : 'foo © bar ≠ baz 𝌆 qux'

1 Comment

Marty Chang Over a year ago

Thanks for the answer (inconvenient as it may be) that what I want doesn't exist. Can you post a different solution though? Or just remove the solution link? That linked solution neither decodes HTML entities nor handles & vs. numeric encoding.

Code Guru · Accepted Answer · 2020-01-06 09:58:21Z

To unescape HTML entities, you can let the browser do it for you.

Way1

_unescape(html: string) :string { 
   const divElement = document.createElement("div");
   divElement.innerHTML = html;
   return divElement.textContent || divElement.innerText || "";
}

Way2

_unescape(html: string) :string {
     let returnText = html;
     returnText = returnText.replace(/&nbsp;/gi, " ");
     returnText = returnText.replace(/&quot;/gi, `"`);
     returnText = returnText.replace(/&lt;/gi, "<");
     returnText = returnText.replace(/&gt;/gi, ">");
     // Decode &amp; last so that "&amp;lt;" becomes "&lt;" rather than "<".
     returnText = returnText.replace(/&amp;/gi, "&");
     return returnText;
}

You can also use Underscore's or Lodash's unescape method, but it ignores &nbsp; and handles only the &amp;, &lt;, &gt;, &quot;, and &#39; entities.

Ryan - Llaver · Accepted Answer · 2021-11-17 06:30:09Z

1

The reverse (decode) of the answer (encode) @asafel provided:

const decodeEscapedHTML = (str) =>
  str.replace(
    /&(\D+);/gi,
    (tag) =>
      ({
        '&amp;': '&',
        '&lt;': '<',
        '&gt;': '>',
        '&#39;': "'",
        '&quot;': '"',
      }[tag]),
  )

answered Nov 17, 2021 at 6:30

Ryan - Llaver


1 Comment

Bergi Nov 7, 2024 at 2:20

This fails for other entities. And &#39; will never be matched by &\D+;!
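
One way to address both points in the comment is to match only the references that are present in the map, so that `&#39;` is covered and nothing else is touched. The sketch below is an assumption about how the answer could be repaired, not the answerer's own code. The names `ENTITY_TO_CHAR` and `decodeBasicEntities` are hypothetical, and it still handles only these five references.

```javascript
// Hypothetical repaired decoder: the inverse of the five-character escape map.
const ENTITY_TO_CHAR = {
  '&amp;': '&',
  '&lt;': '<',
  '&gt;': '>',
  '&#39;': "'",
  '&quot;': '"',
};

// The alternation lists exactly the keys above, so every match has a mapping
// and "&#39;" (which contains digits) is matched too.
const decodeBasicEntities = str =>
  str.replace(/&(?:amp|lt|gt|quot|#39);/g, ref => ENTITY_TO_CHAR[ref]);

console.log(decodeBasicEntities('&lt;a href=&quot;x&quot;&gt;Tom &amp; Jerry&#39;s&lt;/a&gt;'));
// <a href="x">Tom & Jerry's</a>
console.log(decodeBasicEntities('&amp;lt; &copy;'));
// &lt; &copy;
```

Because everything is decoded in a single pass, `&amp;lt;` becomes `&lt;` and is not decoded a second time.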

The Jared Wilcurt · Accepted Answer · 2024-11-07 01:59:04Z

The top answer was kinda unreadable and unmaintainable. I rewrote it so that all you need to do to tweak it is add (or remove) key/value pairs in the map. I also swapped &#39; with &apos; because I'm assuming you're using HTML5+, which had widespread browser adoption by 2011.

/**
 * Escapes special HTML characters.
 *
 * @example
 * '<div title="text">1 & 2</div>'
 * becomes
 * '&lt;div title=&quot;text&quot;&gt;1 &amp; 2&lt;/div&gt;'
 *
 * @param  {string} value  Any input string.
 * @return {string}        The same string, but with encoded HTML entities.
 */
export const escapeHtml = function (value) {
  // https://html.spec.whatwg.org/multipage/named-characters.html
  const namedHtmlEntityMap = {
    '&': '&amp;',
    '<': '&lt;',
    '>': '&gt;',
    '\'': '&apos;',
    '"': '&quot;'
  };
  const charactersToEncode = Object.keys(namedHtmlEntityMap).join('');
  const regexp = new RegExp('[' + charactersToEncode + ']', 'g');
  const encode = function (character) {
    return namedHtmlEntityMap[character];
  };

  return value.replace(regexp, encode);
};
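
Because the regular expression is assembled from the map's keys, adding a key that is special inside a character class (for example `]`, `\`, `^`, or `-`) would change what the class matches. Here is a minimal sketch of one way to guard against that, assuming a hypothetical helper named `buildEscaper` that is not part of the answer above:

```javascript
// Hypothetical helper: build an escaper from any single-character map.
const buildEscaper = entityMap => {
  // Backslash-escape characters that are special inside [...].
  const classBody = Object.keys(entityMap)
    .map(ch => ch.replace(/[\\\]^-]/g, '\\$&'))
    .join('');
  const regexp = new RegExp('[' + classBody + ']', 'g');
  return value => value.replace(regexp, ch => entityMap[ch]);
};

const escapeHtml = buildEscaper({
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  "'": '&apos;',
  '"': '&quot;',
  ']': '&#93;', // would otherwise close the character class early
});

console.log(escapeHtml('<b>a[0] & "x"</b>'));
// &lt;b&gt;a[0&#93; &amp; &quot;x&quot;&lt;/b&gt;
```

With only the five original keys the escaping step changes nothing, so the behaviour matches the function above.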

Kamran Gasimov · Accepted Answer · 2023-11-11 19:24:51Z

Simple htmlEncode and htmlDecode

HTML Encode Function

  function encodeHtml(str) {
    const buf = [];

    // Iterate by code point so astral characters become a single reference.
    for (const ch of str) {
      if (!(/^[a-zA-Z0-9]$/.test(ch)))
        buf.push('&#' + ch.codePointAt(0) + ';');
      else
        buf.push(ch);
    }

    return buf.join('');
  }

HTML Decode function

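  function decodeHtml(str) {
    return str.replace(/&#(\d+);/g, function(match, dec) {
      const code = Number(dec);
      // Leave references beyond the Unicode range untouched instead of throwing.
      return code <= 0x10FFFF ? String.fromCodePoint(code) : match;
    });
  }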
