About charset="utf-8"
What's a Character Set?
- The charset attribute specifies the character encoding for the HTML document.
- A character set is a fixed collection of symbols. For example, the English alphabet “A” to “Z” and “a” to “z” can be a character set, with a total of 52 symbols.
Common values:
- UTF-8 - Character encoding for Unicode
UTF-8 (Universal Character Set Transformation Format, 8-bit) is a character encoding capable of encoding every character (code point) in Unicode. The encoding is variable-length and uses 8-bit code units: each code point takes between one and four bytes.
The Unicode Consortium cooperates with the leading standards development organizations, like ISO, W3C, and ECMA.
Because the character sets in ISO-8859 were limited in size and not compatible in multilingual environments, the Unicode Consortium developed the Unicode Standard.
The Unicode Standard covers (almost) all the characters, punctuation marks, and symbols in the world.
Unicode enables processing, storage, and transport of text independent of platform and language.
The default character encoding in HTML5 is UTF-8.
- ISO-8859-1 - Character encoding for the Latin alphabet
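To make the difference between the two values concrete, here is a minimal Python sketch. It counts how many bytes UTF-8 needs for a few code points and checks whether each one can be stored in ISO-8859-1 at all. The sample code points and the helper names utf8_length and fits_latin1 are my own picks for illustration, not part of any standard.

```python
# Code points picked for illustration (hypothetical samples):
# U+0041 "A", U+00E9 e with acute accent, U+20AC euro sign, U+1F600 an emoji.
SAMPLES = [0x41, 0xE9, 0x20AC, 0x1F600]


def utf8_length(code_point: int) -> int:
    """Return how many bytes (8-bit code units) UTF-8 uses for one code point."""
    return len(chr(code_point).encode("utf-8"))


def fits_latin1(code_point: int) -> bool:
    """Return True if ISO-8859-1 can represent the code point."""
    try:
        chr(code_point).encode("iso-8859-1")
        return True
    except UnicodeEncodeError:
        # ISO-8859-1 only has room for 256 characters, one byte each.
        return False


for cp in SAMPLES:
    print(f"U+{cp:04X}: {utf8_length(cp)} byte(s) in UTF-8, "
          f"fits ISO-8859-1: {fits_latin1(cp)}")
```

UTF-8 grows from one to four bytes as the code points get larger, while ISO-8859-1 stays at one byte per character and cannot store the euro sign or the emoji.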
Syntax:
<meta charset="character_set">
Tip: The first 128 characters of Unicode (which correspond one-to-one with ASCII) are encoded using a single octet with the same binary value as ASCII, making valid ASCII text valid UTF-8-encoded Unicode as well.
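The tip about ASCII can be checked with a few lines of Python. This is only a sketch: the sample string and the helper name single_byte_matches are assumptions made for the example.

```python
# A pure-ASCII sample string (illustrative only).
text = "charset"

ascii_bytes = text.encode("ascii")
utf8_bytes = text.encode("utf-8")

# ASCII text produces the same bytes under both encodings.
print(ascii_bytes == utf8_bytes)


def single_byte_matches(code_point: int) -> bool:
    """Return True if UTF-8 encodes the code point as one byte with the same value."""
    return chr(code_point).encode("utf-8") == bytes([code_point])


# Holds for every one of the first 128 code points (0 to 127).
all_ascii_match = all(single_byte_matches(cp) for cp in range(128))
print(all_ascii_match)
```

From code point 128 upward the check fails, because UTF-8 switches to multi-byte sequences there.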