ASCII/Unicode Converter
Convert between text and ASCII/Unicode code points
Understanding ASCII and Unicode
Character encoding is a method of representing characters as numbers for storage and processing by computers. ASCII and Unicode are two of the most important character encoding standards used in computing.
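For a concrete picture of the conversion described on this page, the short sketch below (illustrative Python, not this tool's actual implementation) uses the built-in ord() and chr() functions to map text to code points and back:

```python
# Minimal sketch of a text <-> code point conversion using Python built-ins.
text = "Hi!"

# Text -> code points: ord() returns the character's Unicode code point.
code_points = [ord(ch) for ch in text]
print(code_points)                              # [72, 105, 33]

# Code points -> text: chr() maps a code point back to its character.
print("".join(chr(cp) for cp in code_points))   # Hi!
```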
ASCII (American Standard Code for Information Interchange)
ASCII is one of the earliest character encoding standards, developed in the 1960s. It uses 7 bits to represent each character, allowing for 128 different characters (0-127). These include:
- Control characters (0-31 and 127)
- Printable characters (32-126), including digits, uppercase and lowercase Latin letters, punctuation, and the space
ASCII is limited to English and basic symbols, which led to the development of extended ASCII and eventually Unicode.
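As an illustration of the ranges listed above, the following Python sketch (a hypothetical helper written for this explanation, not part of any library) classifies a character as a control character, a printable ASCII character, or a non-ASCII character:

```python
# Illustrative only: classify a character by the ASCII ranges described above.
def classify_ascii(ch: str) -> str:
    cp = ord(ch)
    if cp > 127:
        return "not ASCII"
    if cp <= 31 or cp == 127:
        return "control character"
    return "printable character"

print(classify_ascii("A"))    # printable character
print(classify_ascii("\t"))   # control character
print(classify_ascii("é"))    # not ASCII
```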
Unicode
Unicode is a modern character encoding standard designed to represent text from all writing systems of the world. It currently contains over 140,000 characters and continues to expand. Unicode assigns each character a unique code point, written as U+XXXX, where XXXX is a hexadecimal number of four to six digits (code points range from U+0000 to U+10FFFF).
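For example, Python's ord() combined with hexadecimal formatting reproduces the U+XXXX notation (a brief sketch, not the converter itself):

```python
# Sketch: print each character's code point in U+XXXX notation.
for ch in "Aé€😀":
    print(f"{ch} -> U+{ord(ch):04X}")
# A -> U+0041
# é -> U+00E9
# € -> U+20AC
# 😀 -> U+1F600
```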
Common Unicode encoding formats include the following (compared in the sketch after this list):
- UTF-8: A variable-width encoding that uses 1 to 4 bytes per character
- UTF-16: Uses 2 bytes per character, or 4 bytes (a surrogate pair) for characters outside the Basic Multilingual Plane
- UTF-32: Uses a fixed 4 bytes per character
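As a rough illustration (Python, not this tool's actual code), the sketch encodes a few sample characters in each format and prints the resulting byte counts:

```python
# Sketch: byte counts for the same characters in each encoding.
# The "-le" variants are used so no byte order mark (BOM) is added.
for ch in ["A", "é", "中", "😀"]:
    utf8  = len(ch.encode("utf-8"))      # 1, 2, 3, 4 bytes
    utf16 = len(ch.encode("utf-16-le"))  # 2, 2, 2, 4 bytes
    utf32 = len(ch.encode("utf-32-le"))  # 4, 4, 4, 4 bytes
    print(ch, utf8, utf16, utf32)
```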
UTF-8 is the most widely used encoding on the web because it's backward compatible with ASCII (the first 128 characters are encoded identically) and efficiently handles a wide range of characters.
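To see that compatibility directly, the small sketch below (illustrative Python) shows that ASCII-range characters produce identical bytes whether encoded as ASCII or as UTF-8:

```python
# Sketch: pure-ASCII text encodes to the same bytes in ASCII and UTF-8,
# so any valid ASCII file is also valid UTF-8.
text = "Hello"
utf8_bytes = text.encode("utf-8")
ascii_bytes = text.encode("ascii")
print(list(utf8_bytes))            # [72, 101, 108, 108, 111]
print(utf8_bytes == ascii_bytes)   # True
```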