It's really simple:
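- To avoid SQL injection, run your values through mysql_real_escape_string (or mysqli_real_escape_string, since the old mysql_* functions were removed in PHP 7) before concatenating them into an SQL query, or use parameterized queries, which don't suffer from malformed strings in the first place.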
- To avoid XSS problems and/or messed up HTML, HTML escape your values before plugging them into an HTML context.
- JSON escape them in a JSON context, CSV escape them in a CSV context, and so on.
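As a rough sketch of the JSON and CSV cases, the following lets PHP's built-in json_encode and fputcsv do the escaping for their respective formats. The sample value and the use of a php://memory stream are assumptions made purely for illustration.

```php
<?php
// Hypothetical sample value containing both quotes and a comma.
$value = 'She said "hi", then left';

// JSON context: json_encode adds the surrounding quotes and escapes the inner ones.
$json = json_encode($value);

// CSV context: fputcsv encloses the field and doubles the embedded quotes.
// An in-memory stream is used here only so the sketch needs no file on disk.
$fh = fopen('php://memory', 'w+');
fputcsv($fh, [1, $value], ',', '"', '\\');
rewind($fh);
$csv = stream_get_contents($fh);
fclose($fh);

echo $json, "\n", $csv;
```

The same raw value ends up with two different escaped forms, one per target format, which is the point: the context decides the escaping, not the value.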
All of these are the same problem, really. As a very simple example, to produce the string "test" (with the quotes as part of the string), I can't write the string literal $foo = ""test"". I have to escape the inner quotes to make clear which quotes end the string and which are part of it: $foo = "\"test\"".
SQL injection, XSS problems and messed up HTML are all just variations on this.
To plug a value that contains quotes into a query, you have the same problem as above:
$comment = "\"foo\""; // comment is "foo", including quotes
$query = 'INSERT INTO `db` (`comment`) VALUES ("' . $comment . '")';
// INSERT INTO `db` (`comment`) VALUES (""foo"")
That produces invalid syntax at best, SQL injection attacks at worst. Using mysql_real_escape_string avoids this:
$query = 'INSERT INTO `db` (`comment`) VALUES ("' . mysql_real_escape_string($comment) . '")';
// INSERT INTO `db` (`comment`) VALUES ("\"foo\"")
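For comparison, here is a minimal sketch of the parameterized-query alternative using PDO. It assumes the pdo_sqlite extension is available and uses an in-memory SQLite database as a stand-in for MySQL so it runs without a server; the table name comments is hypothetical.

```php
<?php
// Assumption: in-memory SQLite stands in for MySQL; with MySQL only the DSN
// and credentials passed to the PDO constructor would differ.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE comments (comment TEXT)'); // hypothetical table

$comment = "\"foo\""; // comment is "foo", including quotes

// The query text contains only a placeholder; the value is sent separately,
// so its quotes can never be mistaken for SQL syntax.
$stmt = $pdo->prepare('INSERT INTO comments (comment) VALUES (:comment)');
$stmt->execute([':comment' => $comment]);

// Read the value back to check that it was stored unchanged.
$stored = $pdo->query('SELECT comment FROM comments')->fetchColumn();
var_dump($stored === $comment);
```

Because the value never becomes part of the SQL text, there is nothing to escape and nothing to forget.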
HTML escaping is exactly the same, just with different syntax issues.
You only need to escape your values in the right context using the right method. To escape values for HTML, use htmlentities. Do that at the time it's necessary. Don't escape prematurely and don't over-escape your values; apply the appropriate escape function only in the right context, at the right time.
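A small sketch of what that looks like in practice, with a made-up comment value: the raw value is kept as-is and htmlentities is applied once, at the point where it is plugged into HTML. The last lines show what over-escaping does to the output.

```php
<?php
// Hypothetical user-supplied value, kept raw until it is output.
$comment = '<script>alert("hi")</script> & "quotes"';

// Escape exactly once, at the moment the value enters an HTML context.
$escaped = htmlentities($comment, ENT_QUOTES, 'UTF-8');
$html = '<p>' . $escaped . '</p>';
echo $html, "\n";

// Over-escaping: applying the function twice mangles the text the user sees.
$once  = htmlentities('a & b', ENT_QUOTES, 'UTF-8');
$twice = htmlentities($once, ENT_QUOTES, 'UTF-8');
echo $once, "\n";  // a &amp; b
echo $twice, "\n"; // a &amp;amp; b
```

The explicit ENT_QUOTES and 'UTF-8' arguments are a choice made for this sketch so that both quote styles are escaped regardless of PHP version defaults.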