UTF-8 (Unicode Transformation Format, 8-bit) is a variable-width character encoding that represents each Unicode code point as a sequence of one to four bytes. Earlier definitions allowed sequences of up to six bytes, but RFC 3629 limits them to four. UTF-8 is backwards-compatible with ASCII while still supporting representation of all Unicode code points.
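As a quick illustration of the variable width and the ASCII compatibility, the following sketch relies on Python's built-in `str.encode`. The sample code points are arbitrary choices made for this example, not values taken from any specification.

```python
# Sample characters chosen for illustration only (an assumption of this sketch):
# U+0041 (A), U+00E9 (e with acute), U+20AC (euro sign), U+1F600 (an emoji).
samples = ["\u0041", "\u00e9", "\u20ac", "\U0001F600"]

# Map each code point to the number of bytes in its UTF-8 encoding.
lengths = {ord(ch): len(ch.encode("utf-8")) for ch in samples}

for ch in samples:
    encoded = ch.encode("utf-8")
    print(f"U+{ord(ch):04X} -> {len(encoded)} byte(s): {encoded.hex(' ')}")

# Pure ASCII text yields identical bytes under both encodings.
ascii_compatible = "hello".encode("utf-8") == "hello".encode("ascii")
```

The higher the code point, the longer the sequence, while ASCII text passes through unchanged.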
UTF-8 is the most widely used character encoding and is recommended for use on the Internet. It is the standard character encoding on Linux and other recent Unix-like operating systems.
The algorithm for encoding code points in UTF-8 is described in RFC 3629.
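The encoding step can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name `utf8_encode` is hypothetical, and in practice the built-in `str.encode("utf-8")` would normally be used instead.

```python
def utf8_encode(code_point: int) -> bytes:
    """Encode a single Unicode scalar value as UTF-8 (illustrative helper)."""
    if code_point < 0 or code_point > 0x10FFFF:
        raise ValueError("code point outside the Unicode range")
    if 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("surrogate code points cannot be encoded in UTF-8")

    if code_point <= 0x7F:
        # One byte: 0xxxxxxx (identical to ASCII).
        return bytes([code_point])
    if code_point <= 0x7FF:
        # Two bytes: 110xxxxx 10xxxxxx
        return bytes([
            0xC0 | (code_point >> 6),
            0x80 | (code_point & 0x3F),
        ])
    if code_point <= 0xFFFF:
        # Three bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([
            0xE0 | (code_point >> 12),
            0x80 | ((code_point >> 6) & 0x3F),
            0x80 | (code_point & 0x3F),
        ])
    # Four bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([
        0xF0 | (code_point >> 18),
        0x80 | ((code_point >> 12) & 0x3F),
        0x80 | ((code_point >> 6) & 0x3F),
        0x80 | (code_point & 0x3F),
    ])
```

For example, `utf8_encode(0x20AC)` produces the same three bytes as `"\u20ac".encode("utf-8")`. The lead byte signals the sequence length, and every continuation byte starts with the bits `10`, which is what lets a decoder resynchronize in the middle of a stream.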