I'm having trouble parsing tweets that are represented as escaped Unicode strings. Some of them turn out to be foreign-language text, e.g. \u064a\u0633\u0639\u062f\u0646\u064a

2 Answers

Use org.apache.commons.lang.StringEscapeUtils:

import org.apache.commons.lang.StringEscapeUtils;

String s = "\\u0048\\u0065\\u006C\\u006C\\u006F";
System.out.println(StringEscapeUtils.unescapeJava(s)); // prints Hello

P.S. Oops, I didn't refresh the page before posting this answer; the comments above convey the same thing.

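You can try unescapeJava from Apache Commons Lang:

str = org.apache.commons.lang.StringEscapeUtils.unescapeJava(str);

See http://commons.apache.org/proper/commons-lang/javadocs/api-3.1/org/apache/commons/lang3/StringEscapeUtils.html (this Javadoc is for Commons Lang 3, where the package is org.apache.commons.lang3 rather than org.apache.commons.lang).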
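If adding Commons Lang as a dependency is not an option, the same decoding can be approximated with the JDK alone. The following is only a minimal sketch, not how unescapeJava is implemented: UnicodeUnescaper and unescape are hypothetical names, and it assumes the input contains nothing but plain text and backslash-u escapes with exactly four hex digits (other Java escapes such as \n or an escaped backslash are left untouched).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of any library.
final class UnicodeUnescaper {
    // A literal backslash, the letter 'u', then exactly four hex digits.
    private static final Pattern ESCAPE = Pattern.compile("\\\\u([0-9a-fA-F]{4})");

    static String unescape(String input) {
        Matcher m = ESCAPE.matcher(input);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            // Copy the text between the previous escape and this one unchanged.
            out.append(input, last, m.start());
            // Turn the four hex digits into a single UTF-16 code unit.
            out.append((char) Integer.parseInt(m.group(1), 16));
            last = m.end();
        }
        // Copy whatever follows the final escape.
        out.append(input, last, input.length());
        return out.toString();
    }

    public static void main(String[] args) {
        // The sample from the question, written as a Java literal.
        String tweet = "\\u064a\\u0633\\u0639\\u062f\\u0646\\u064a";
        System.out.println(unescape(tweet));
    }
}
```

Because each escape is appended as one UTF-16 code unit, characters outside the Basic Multilingual Plane should still come out right as long as the tweet encodes them as a surrogate pair of two consecutive escapes. For anything beyond this narrow case, the library call in the answers above is the safer choice.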
