Before Unicode, different computing systems used different character encodings to represent text. Each single-byte encoding maps the byte values 0-255 to specific characters. The first 128 values (standard ASCII) are consistent across most encodings, but the upper 128 values (128-255) vary significantly between character sets.
The original IBM PC character set, used in DOS and the BIOS.
The default character encoding for Windows in Western languages.
The standard Western European character encoding.
Character encoding for Central European languages.
The first 128 byte values (0-127) map to the same characters in all of these encodings. This is the standard ASCII set: control characters, digits, uppercase and lowercase letters, and basic punctuation.
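The shared ASCII range can be checked directly. The following is a minimal Python sketch, assuming the standard-library codec names `cp437`, `cp1252`, `iso8859_1`, and `iso8859_2` as stand-ins for the encodings discussed here; the helper name `ascii_range_matches` is made up for illustration.

```python
# Codec names are Python's standard-library names; the choice of these four
# encodings is an assumption made for this illustration.
ENCODINGS = ["cp437", "cp1252", "iso8859_1", "iso8859_2"]

# Every byte value in the ASCII range, 0 through 127.
ASCII_BYTES = bytes(range(128))
REFERENCE = ASCII_BYTES.decode("ascii")


def ascii_range_matches(encoding):
    """Return True if the encoding decodes bytes 0-127 exactly like ASCII."""
    return ASCII_BYTES.decode(encoding) == REFERENCE


for name in ENCODINGS:
    print(f"{name}: lower half matches ASCII = {ascii_range_matches(name)}")
```

Note that this compares the code points Python's codecs assign. Some systems display different glyphs for the control codes 0-31 (the IBM PC screen font is a well-known case), which is a display matter rather than a difference in the mapping tested here.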
The upper 128 byte values are where encodings differ. CP437 uses box-drawing characters and Greek letters. Windows-1252 adds smart quotes and the Euro sign. ISO 8859 variants serve different language groups.
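To see the differences concretely, the same byte can be decoded under several encodings. This is a small Python sketch, not a complete comparison table; the sample byte values and the helper name `interpretations` are chosen only for illustration, and the codec names are Python's standard-library names.

```python
# Encodings to compare, by their Python codec names (an assumption of this sketch).
ENCODINGS = ["cp437", "cp1252", "iso8859_1", "iso8859_2"]


def interpretations(byte_value):
    """Return {encoding: character} for one byte value in the range 0-255."""
    raw = bytes([byte_value])
    # errors="replace" yields U+FFFD where an encoding leaves the byte undefined.
    return {name: raw.decode(name, errors="replace") for name in ENCODINGS}


# A few upper-half bytes where the encodings disagree.
for value in (0x80, 0x93, 0xB1, 0xC9, 0xE9):
    row = interpretations(value)
    code_points = {name: f"U+{ord(ch):04X}" for name, ch in row.items()}
    print(f"0x{value:02X}", code_points)
```

For example, byte 0xC9 is a box-drawing corner in CP437 but the letter É in Windows-1252, and byte 0x80 is the Euro sign in Windows-1252 but a control code in ISO 8859-1.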
Modern systems use Unicode, usually encoded as UTF-8, which defines over 140,000 characters covering most of the world's writing systems. UTF-8 is backward-compatible with ASCII: the first 128 code points are encoded as the same single bytes, so any ASCII text is also valid UTF-8.
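The ASCII compatibility of UTF-8 can be illustrated with a short Python sketch. The helper name `utf8_hex` and the sample characters are assumptions chosen for the example.

```python
def utf8_hex(text):
    """Return the UTF-8 encoding of text as space-separated hex bytes."""
    return text.encode("utf-8").hex(" ")


# ASCII text produces identical bytes under ASCII and UTF-8.
ascii_text = "Hello"
print(ascii_text.encode("ascii") == ascii_text.encode("utf-8"))

# Characters outside ASCII take two or more bytes in UTF-8.
for ch in ("A", "\u00e9", "\u20ac"):  # A, e with acute accent, Euro sign
    print(f"U+{ord(ch):04X} -> {utf8_hex(ch)}")
```

Every byte of a multi-byte UTF-8 sequence has its high bit set, so such bytes never collide with the ASCII range 0-127.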
Understanding character encodings is crucial for working with international text, debugging garbled characters (mojibake), parsing legacy data files, and ensuring correct text display across systems.
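As an illustration of how mojibake arises and how it can sometimes be undone, here is a minimal Python sketch. It assumes the common case of UTF-8 text mistakenly decoded as Windows-1252; the function names `garble` and `repair` are hypothetical, and the repair only works when the wrong decoding lost no bytes.

```python
def garble(text, wrong_encoding="cp1252"):
    """Simulate mojibake: encode as UTF-8, then decode with the wrong encoding."""
    return text.encode("utf-8").decode(wrong_encoding)


def repair(garbled, wrong_encoding="cp1252"):
    """Reverse the mistake: recover the original bytes, then decode as UTF-8."""
    return garbled.encode(wrong_encoding).decode("utf-8")


original = "caf\u00e9"       # "cafe" with an acute accent on the e
broken = garble(original)    # the accented letter becomes two characters
print(broken)
print(repair(broken) == original)
```

In practice the wrong encoding is usually unknown, so a reasonable approach is to try a few likely candidates and keep the result that decodes cleanly and reads sensibly.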