Is there a native way to encode or decode HTML entities using JavaScript or ES6? For example, < would be encoded as &lt;. There are libraries like html-entities for Node.js, but it feels like there should be something built into JavaScript that already handles this common need.
- There is not a native JavaScript facility. JavaScript the programming language does not really have much to do with HTML, goofy APIs on the String prototype notwithstanding. – Pointy, Oct 26, 2016 at 13:39
- @Pointy I think generally speaking you're right. It just feels like since JavaScript is so widely used on the web, and HTML entities are a common feature of web development, something like this would've made its way into the language over the past decade. – Marty Chang, Oct 26, 2016 at 14:21
- I think the question would benefit from clearly including the existence of such a function in browsers and the Node.js standard library in its scope. – hippietrail, Nov 24, 2017 at 4:19
7 Answers
A nice function using ES6 for escaping HTML:
const escapeHTML = str => str.replace(/[&<>'"]/g,
  tag => ({
    '&': '&amp;',
    '<': '&lt;',
    '>': '&gt;',
    "'": '&#39;',
    '"': '&quot;'
  }[tag]));
Roll Your Own (caveat - use HE instead for most use cases)
Without a library, you can encode and decode HTML entities in plain JavaScript like this:
let encode = str => {
  let buf = [];
  for (var i = str.length - 1; i >= 0; i--) {
    buf.unshift(['&#', str[i].charCodeAt(), ';'].join(''));
  }
  return buf.join('');
}
let decode = str => {
  return str.replace(/&#(\d+);/g, function(match, dec) {
    return String.fromCharCode(dec);
  });
}
Usages:
encode("Hello > © <") // "&#72;&#101;&#108;&#108;&#111;&#32;&#62;&#32;&#169;&#32;&#60;"
decode("Hello &gt; &copy; &#169; &lt;") // "Hello &gt; &copy; © &lt;"
However, you can see this approach has a couple of shortcomings:
- It encodes even safe characters: H → &#72;
- It can decode numeric codes (not in the astral plane), but doesn't know anything about the full list of HTML entities / named character references supported by browsers, like &gt;
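Both shortcomings of the numeric approach can be reduced without a library. The following is only a sketch under stated assumptions: the names encodeNumeric and decodeNumeric are made up for illustration, "safe" is assumed to mean any ASCII character other than & < > " ', and named references such as &gt; are still not decoded. It iterates by code point rather than by UTF-16 unit, so astral characters survive a round trip.

```javascript
// Hypothetical helpers for illustration; not part of any library.

// Encode by code point: keep ASCII as-is except the five HTML-significant
// characters, and turn everything else into a decimal numeric reference.
const encodeNumeric = (str) =>
  Array.from(str, (ch) => {
    const cp = ch.codePointAt(0);
    const safe = cp < 0x80 && !'&<>"\''.includes(ch);
    return safe ? ch : `&#${cp};`;
  }).join('');

// Decode decimal (&#169;) and hexadecimal (&#xA9;) numeric references.
// Named references such as &gt; are left untouched.
const decodeNumeric = (str) =>
  str.replace(/&#(?:x([0-9a-f]+)|(\d+));/gi, (match, hex, dec) => {
    const cp = hex !== undefined ? parseInt(hex, 16) : parseInt(dec, 10);
    // Leave out-of-range references alone instead of throwing a RangeError.
    return cp <= 0x10ffff ? String.fromCodePoint(cp) : match;
  });

encodeNumeric('Hello > © 𝌆');        // 'Hello &#62; &#169; &#119558;'
decodeNumeric('&#72;i &#x1D306; &gt;'); // 'Hi 𝌆 &gt;'
```

Array.from and String.fromCodePoint are both ES2015, so this stays within the ES6 scope of the question.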
Use the HE Library (Html Entities)
- Support for all standardized named character references
- Support for unicode
- Works with ambiguous ampersands
- Written by Mathias Bynens
Usage:
he.encode('foo © bar ≠ baz 𝌆 qux');
// Output : 'foo &#xA9; bar &#x2260; baz &#x1D306; qux'
he.decode('foo &copy; bar &ne; baz &#x1D306; qux');
// Output : 'foo © bar ≠ baz 𝌆 qux'
There is no native function in the JavaScript API that converts ASCII characters to their "html-entities" equivalent. Here is a beginning of a solution and an easy trick that you may like.
1 Comment
& vs. numeric encoding.
To unescape HTML entities, your browser is smart and will do it for you.
Way1
_unescape(html: string): string {
  const divElement = document.createElement("div");
  divElement.innerHTML = html;
  return divElement.textContent || divElement.innerText || "";
}
Way2
_unescape(html: string): string {
  let returnText = html;
  returnText = returnText.replace(/&nbsp;/gi, " ");
  returnText = returnText.replace(/&quot;/gi, `"`);
  returnText = returnText.replace(/&lt;/gi, "<");
  returnText = returnText.replace(/&gt;/gi, ">");
  // Decode &amp; last so that input like "&amp;lt;" is not decoded twice.
  returnText = returnText.replace(/&amp;/gi, "&");
  return returnText;
}
You can also use Underscore's or Lodash's unescape method, but it ignores &nbsp; and handles only the &amp;, &lt;, &gt;, &quot;, and &#39; entities.
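With a chain of replace calls the order matters: if &amp; is decoded before the other entities, the text &amp;lt; ends up as < instead of &lt;. A single pass over the string sidesteps the ordering question. This is a minimal sketch, not a full decoder: unescapeOnce and NAMED_ENTITIES are hypothetical names, the table covers only the five entities handled in Way2, and &nbsp; is mapped to U+00A0 (as a browser would) rather than to a plain space.

```javascript
// Hypothetical helper for illustration; the entity table is deliberately tiny.
const NAMED_ENTITIES = { nbsp: '\u00A0', amp: '&', quot: '"', lt: '<', gt: '>' };

// Each entity is matched and replaced exactly once, left to right,
// so the output of one replacement is never scanned again.
const unescapeOnce = (html) =>
  html.replace(/&(nbsp|amp|quot|lt|gt);/g, (match, name) => NAMED_ENTITIES[name]);

unescapeOnce('&lt;a href=&quot;x&quot;&gt;'); // '<a href="x">'
unescapeOnce('&amp;lt;');                     // '&lt;'
```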
The reverse (decode) of the answer (encode) @rasafel provided:
const decodeEscapedHTML = (str) =>
  str.replace(
    /&(\D+);/gi,
    (tag) =>
      ({
        '&amp;': '&',
        '&lt;': '<',
        '&gt;': '>',
        '&#39;': "'",
        '&quot;': '"',
      }[tag]),
  )
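The pattern /&(\D+);/ has two problems: it cannot match &#39; because of the digits, and since \D+ is greedy it can swallow several entities at once (for example all of &lt;b&gt;), in which case the lookup returns undefined. One possible way around both, sketched here with hypothetical names (ESCAPES, UNESCAPES, decodeBasicEntities), is to invert the encode map and build the pattern from its keys, so that only the five known entities are ever matched.

```javascript
// Hypothetical names for illustration; the map mirrors the five characters
// escaped by the encode function this answer is reversing.
const ESCAPES = { '&': '&amp;', '<': '&lt;', '>': '&gt;', "'": '&#39;', '"': '&quot;' };

// Invert the map: '&amp;' -> '&', '&lt;' -> '<', ...
const UNESCAPES = Object.fromEntries(
  Object.entries(ESCAPES).map(([ch, entity]) => [entity, ch])
);

// Match exactly the known entities, nothing else. None of them contains a
// regex metacharacter, so the keys can be joined without escaping.
const entityPattern = new RegExp(Object.keys(UNESCAPES).join('|'), 'g');

const decodeBasicEntities = (str) =>
  str.replace(entityPattern, (entity) => UNESCAPES[entity]);

decodeBasicEntities('&lt;b&gt;it&#39;s&lt;/b&gt; &copy;'); // "<b>it's</b> &copy;"
```

Unknown entities such as &copy; pass through unchanged, and adding a pair to ESCAPES updates the decoder as well. Object.fromEntries is ES2019; on older runtimes a reduce over the entries does the same job.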
1 Comment
&#39; will never be matched by &\D+;!
The top answer was kinda unreadable and unmaintainable. I rewrote it so that all you need to do to tweak it is add (or remove) key/value pairs in the map. I also swapped &#39; with &apos; because I'm assuming you're using HTML5+, which had widespread browser adoption by 2011.
/**
 * Escapes special HTML characters.
 *
 * @example
 * '<div title="text">1 & 2</div>'
 * becomes
 * '&lt;div title=&quot;text&quot;&gt;1 &amp; 2&lt;/div&gt;'
 *
 * @param {string} value Any input string.
 * @return {string} The same string, but with encoded HTML entities.
 */
export const escapeHtml = function (value) {
  // https://html.spec.whatwg.org/multipage/named-characters.html
  const namedHtmlEntityMap = {
    '&': '&amp;',
    '<': '&lt;',
    '>': '&gt;',
    '\'': '&apos;',
    '"': '&quot;'
  };
  const charactersToEncode = Object.keys(namedHtmlEntityMap).join('');
  const regexp = new RegExp('[' + charactersToEncode + ']', 'g');
  const encode = function (character) {
    return namedHtmlEntityMap[character];
  };
  return value.replace(regexp, encode);
};
Simple htmlEncode and htmlDecode
HTML Encode function
function encodeHtml(str) {
  let buf = [];
  for (var i = str.length - 1; i >= 0; i--) {
    // Leave ASCII letters and digits alone; encode everything else numerically.
    if (!(/^[a-zA-Z0-9]$/.test(str[i])))
      buf.unshift(['&#', str[i].charCodeAt(), ';'].join(''));
    else
      buf.unshift(str[i]);
  }
  return buf.join('');
}
HTML Decode function
function decodeHtml(str) {
  return str.replace(/&#(\d+);/g, function(match, dec) {
    return String.fromCharCode(dec);
  });
}