About charset="utf-8"
What's a Character Set?

  • The charset attribute specifies the character encoding for the HTML document
  • A character set is a fixed collection of symbols. For example, the English alphabet “A” to “Z” and “a” to “z” can be a character set, with a total of 52 symbols.
Common values:
  • UTF-8 - Character encoding for Unicode
UTF-8 stands for Unicode Transformation Format, 8-bit. The name is also expanded as Universal Character Set Transformation Format, 8-bit.

It is a character encoding capable of encoding every character in Unicode, where each character is identified by a number called a code point. The encoding is variable-length and uses 8-bit code units: each code point takes between one and four bytes.
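To make "variable-length" concrete, here is a minimal Python sketch that encodes a few sample characters and counts the 8-bit code units each one needs. The sample characters and the helper name `utf8_length` are illustrative choices, not part of any standard.

```python
# Sample characters chosen for illustration: Latin letter, accented letter,
# euro sign, and an emoji outside the Basic Multilingual Plane.
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]


def utf8_length(ch: str) -> int:
    """Return how many 8-bit code units UTF-8 uses for one character."""
    return len(ch.encode("utf-8"))


for ch in samples:
    encoded = ch.encode("utf-8")
    # Print the code point and the encoded bytes in hexadecimal.
    print(f"U+{ord(ch):04X}: {utf8_length(ch)} byte(s) -> {encoded.hex(' ')}")
```

The lower the code point, the fewer bytes it needs, which is why plain English text stays compact in UTF-8.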

The Unicode Consortium cooperates with the leading standards development organizations, like ISO, W3C, and ECMA.

Because the character sets in ISO-8859 were limited in size and incompatible with one another in multilingual environments, the Unicode Consortium developed the Unicode Standard.
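The size limit is easy to see in practice. The sketch below, which assumes Python's built-in `iso-8859-1` codec as a stand-in for the ISO-8859 family, checks whether a string fits in that character set. The helper name `fits_latin1` is hypothetical.

```python
def fits_latin1(text: str) -> bool:
    """Return True if every character in text exists in ISO-8859-1."""
    try:
        text.encode("iso-8859-1")
        return True
    except UnicodeEncodeError:
        # At least one character has no slot in the 256-entry table.
        return False


print(fits_latin1("caf\u00e9"))          # accented Latin letter
print(fits_latin1("\u20ac"))             # euro sign
print(fits_latin1("\u65e5\u672c\u8a9e"))  # Japanese text
```

The same strings all encode without error when the codec is `utf-8`.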

The Unicode Standard covers (almost) all the characters, punctuation marks, and symbols in the world.

Unicode enables processing, storage, and transport of text independent of platform and language.

The default character encoding in HTML5 is UTF-8.
  • ISO-8859-1 - Character encoding for the Latin alphabet
In theory, any character encoding can be used, but no browser understands all of them. The more widely a character encoding is used, the better the chance that a browser will understand it.

Syntax:

<meta charset="character_set">
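The declaration matters because the reader of the file has to decode its bytes with the same encoding that was used to save them. The following Python sketch imitates that situation: it is only an analogy for what a browser does, and the sample text is an arbitrary choice.

```python
text = "caf\u00e9"

# Bytes as they would be stored in a file saved as UTF-8.
utf8_bytes = text.encode("utf-8")

# Decoding with the declared (correct) encoding recovers the text.
correct = utf8_bytes.decode("utf-8")

# Decoding the same bytes as ISO-8859-1 turns the two-byte sequence
# for the accented letter into two unrelated characters.
misread = utf8_bytes.decode("iso-8859-1")

print(correct == text)
print(len(text), len(misread))
```

Garbled output of this kind is what readers see when a page's declared charset does not match the encoding of the file.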

Tip: The first 128 characters of Unicode (which correspond one-to-one with ASCII) are encoded using a single octet with the same binary value as ASCII, making valid ASCII text valid UTF-8-encoded Unicode as well.
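A quick way to check this compatibility is to compare the bytes produced by both encodings. This is a minimal sketch; the sample string is arbitrary.

```python
ascii_text = "Hello, world!"

ascii_bytes = ascii_text.encode("ascii")
utf8_bytes = ascii_text.encode("utf-8")

# The two encodings produce identical bytes for ASCII-only text.
same_bytes = ascii_bytes == utf8_bytes

# Every one of the 128 ASCII values decodes to the same character in UTF-8.
all_ascii = bytes(range(128))
same_characters = all_ascii.decode("ascii") == all_ascii.decode("utf-8")

print(same_bytes, same_characters)
```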
