0

I am building a blog and am currently finishing the admin panel.

Since I will mostly be the one managing it, I want to make sure that when I type

<ul>
   <li>test</li>
   <li>test</li>
</ul>

it is rendered as an unordered list, but XSS vectors are still stripped, just in case.

How could I do that?

Could one solution be to write functions that replace the ul, ol, img, etc. tags?

3
  • Concrete answer depends on the web programming language used. If it were for example Java, you could use Jsoup for this. Please edit your question to mention and tag it accordingly. Commented Mar 22, 2012 at 17:06
  • @BalusC I missed that and have updated the question. I'm using PHP. Commented Mar 22, 2012 at 17:08
  • You could use HTML Purifier. Commented Mar 22, 2012 at 17:25

3 Answers

3

What you are looking for is an HTML sanitizer. These are very hard to write correctly, so you should look at an existing library. For PHP, have a look at HTML Purifier.

Proper XSS protection involves more than HTML sanitizing. The Open Web Application Security Project (OWASP) has put together a canonical guide to avoiding XSS attacks:

XSS (Cross Site Scripting) Prevention Cheat Sheet
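
To illustrate the "more than sanitizing" point: wherever user-supplied text is not supposed to contain markup at all (titles, names, attribute values), the usual approach is to escape it on output rather than sanitize it. Below is a minimal sketch in Python, used here only as a neutral illustration language. The `render_comment` helper and its markup are hypothetical. In PHP the corresponding built-in is `htmlspecialchars()`.

```python
import html


def render_comment(author: str, body: str) -> str:
    """Hypothetical helper: builds an HTML snippet from two plain-text fields."""
    # html.escape replaces &, < and >; with quote=True it also replaces " and ',
    # which matters when the value ends up inside an attribute.
    safe_author = html.escape(author, quote=True)
    safe_body = html.escape(body, quote=True)
    return f'<p class="comment" title="{safe_author}">{safe_body}</p>'


if __name__ == "__main__":
    print(render_comment('Eve" onmouseover="alert(1)',
                         "<script>alert('XSS')</script>"))
```

Escaping like this is only suitable for fields where no HTML is wanted. For the blog body itself, where tags such as ul must survive, a sanitizer is still needed.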


1

Check this URL: http://refactormycode.com/codes/333-sanitize-html

There is another useful thread on this issue and how to handle it: What is the best way to store WMD input/markdown in SQL server and display later?

3 Comments

The first URL is an HTML sanitizer built for StackOverflow, but do note that it only permits a small subset of tags and doesn't meet the OP's desire to allow "all html tags".
Also, if you want to use Markdown as input, check out michelf.com/projects/php-markdown
@colin I pointed the user to a method as a sample; the OP can see how it can be used to strip XSS tags out, and extend or reuse it.
1

The standard way to deal with XSS while allowing HTML is to:

  1. run the HTML through a (real) HTML parser
  2. delete any element or attribute that isn't on a whitelist (use a third-party whitelist as a starting point, and research any additional elements or attributes you add to make sure they don't offer a way to inject JS that you don't know about).
  3. sanity check any URIs
  4. generate clean HTML from the DOM

The specifics will depend on the language you are using.
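
As one illustration of the four steps above, here is a rough sketch in Python using only the standard library. Everything named here (`ALLOWED_TAGS`, `ALLOWED_SCHEMES`, `Sanitizer`, `sanitize`, `is_safe_url`) is hypothetical, the whitelist is an assumption chosen for the example, and `html.parser` is a lenient tokenizer rather than a full DOM builder. It is meant to show the shape of the approach, not to replace a vetted library.

```python
import re
from html import escape
from html.parser import HTMLParser

# Assumed whitelist for illustration: tag -> attributes allowed on that tag.
ALLOWED_TAGS = {
    "p": set(), "ul": set(), "ol": set(), "li": set(),
    "b": set(), "i": set(), "em": set(), "strong": set(),
    "a": {"href", "title"},
    "img": {"src", "alt"},
}
VOID_TAGS = {"img"}                # allowed tags that never get a closing tag
URL_ATTRS = {"href", "src"}        # attributes whose value is a URI
ALLOWED_SCHEMES = {"http", "https", "mailto"}
DROP_CONTENT = {"script", "style"}  # remove these elements including their text

SCHEME_RE = re.compile(r"([A-Za-z][A-Za-z0-9+.\-]*):")
EDGE_CHARS = "".join(chr(c) for c in range(0x21))  # C0 controls and space


def is_safe_url(value: str) -> bool:
    """Step 3: accept relative URLs and whitelisted schemes only."""
    # Browsers ignore tabs and newlines inside a URL and trim leading
    # control characters, so do the same before looking for a scheme.
    cleaned = value.strip(EDGE_CHARS)
    cleaned = cleaned.replace("\t", "").replace("\r", "").replace("\n", "")
    match = SCHEME_RE.match(cleaned)
    return match is None or match.group(1).lower() in ALLOWED_SCHEMES


class Sanitizer(HTMLParser):
    """Steps 1, 2 and 4: parse, filter against the whitelist, re-emit."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in DROP_CONTENT:
            self.skip_depth += 1
            return
        if self.skip_depth or tag not in ALLOWED_TAGS:
            return  # unknown tag is dropped; its text content is kept
        kept = []
        for name, value in attrs:
            if name not in ALLOWED_TAGS[tag] or value is None:
                continue
            if name in URL_ATTRS and not is_safe_url(value):
                continue
            kept.append(f' {name}="{escape(value, quote=True)}"')
        self.out.append(f"<{tag}{''.join(kept)}>")

    def handle_endtag(self, tag):
        if tag in DROP_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif not self.skip_depth and tag in ALLOWED_TAGS and tag not in VOID_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(escape(data, quote=False))  # text is re-escaped


def sanitize(html_text: str) -> str:
    parser = Sanitizer()
    parser.feed(html_text)
    parser.close()  # flush any buffered text
    return "".join(parser.out)


if __name__ == "__main__":
    print(sanitize('<ul><li onclick="alert(1)">test</li></ul>'
                   '<a href="javascript:alert(1)">link</a>'))
```

Comments and doctype declarations are dropped because the parser's default handlers ignore them. The sketch does not rebalance unclosed or stray tags, and it does not handle style attributes at all, which is part of why a maintained library is the safer choice for real input.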

1 Comment

Item (2) is very hard to do correctly by yourself. For example, if you allow the element <a> with the attribute href, then <a href="javascript:alert('XSS')"> might come as a surprise to you. And that's before you get to the likes of <span style="width: expression(alert('XSS'))">, which older versions of Internet Explorer evaluate as script. The sensible thing in this situation is to use a proven library.
