38

Here's my HTML:

<a>View it in your browser</a>
<div id="html">
    <h1>Doggies</h1>
    <p style="color:blue;">Kitties</p>
</div>

How do I use JavaScript to make the href attribute of my link point to a base64 encoded webpage whose source is the innerHTML of div#html?

I basically want to do the same conversion done here (with the base64 checkbox checked) except for in JavaScript.

2 Answers 2

56

Characteristics of a data-URI

A data-URI with MIME-type text/html has to be in one of these formats:

data:text/html,<HTML HERE>
data:text/html;charset=UTF-8,<HTML HERE>

Base-64 encoding is not necessary. If your code contains non-ASCII characters, such as éé, charset=UTF-8 has to be added.

The following characters have to be escaped:

  • # - Firefox and Opera interpret this character as the marker of a hash (as in location.hash).
  • % - This character is used to escape characters. Escape this character to make sure that no side effects occur.

Additionally, if you want to embed the code in an anchor tag, the following characters should also be escaped:

  • " and/or ' - Quotes mark the value of the attribute.
  • & - The ampersand is used to mark HTML entities.
  • < and > do not have to be escaped inside a HTML attribute. However, if you're going to embed the link in the HTML, these should also be escaped (%3C and %3E)

JavaScript implementation

If you don't mind the size of the data-URI, the easiest method to do so is using encodeURIComponent:

var html = document.getElementById("html").innerHTML;
var dataURI = 'data:text/html,' + encodeURIComponent(html);

If size matters, you'd better strip out all consecutive white-space (this can safely be done, unless the HTML contains a <pre> element/style). Then, only replace the significant characters:

var html = document.getElementById("html").innerHTML;
html = html.replace(/\s{2,}/g, '')   // <-- Replace all consecutive spaces, 2+
           .replace(/%/g, '%25')     // <-- Escape %
           .replace(/&/g, '%26')     // <-- Escape &
           .replace(/#/g, '%23')     // <-- Escape #
           .replace(/"/g, '%22')     // <-- Escape "
           .replace(/'/g, '%27');    // <-- Escape ' (to be 100% safe)
var dataURI = 'data:text/html;charset=UTF-8,' + html;
Sign up to request clarification or add additional context in comments.

7 Comments

Thanks for the extensive answer. This was really helpful! :)
Note Opera behaves similarly to Firefox regarding #. Chrome and Safari do not attach a special meaning to #.
Small typo in the example at the bottom. If I'm not mistaken, data:text/html,charset=UTF-8 should be data:text/html;charset=UTF-8
@B1KMusic Thanks for bringing it up. The comma must indeed be changed to a semicolon, and a trailing semicolon need to be added. Revised answer.
Would encodeURIComponent work instead of your many uses of the replace method?
|
0

If size matters, you'd better strip out all consecutive white-space (this can safely be done, unless the HTML contains a <pre> element/style). Then, only replace the significant characters:

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.