Binary encoding is one of the first things developers encounter when learning how computers represent data. Every character you type — a letter, a digit, an emoji — is stored as a sequence of 0s and 1s. This guide explains how text-to-binary conversion works, covers UTF-8 encoding, and shows you how to do it in code.
Convert text to binary online →
How Text-to-Binary Encoding Works
Each character is mapped to a number via a character encoding standard, then that number is expressed in binary (base-2). Modern systems use UTF-8, which is backward-compatible with ASCII and supports the full Unicode character set.
ASCII Characters (1 byte each)
For standard ASCII characters (code points 0–127), each character maps to exactly one byte — 8 bits:
H = 72 = 01001000
e = 101 = 01100101
l = 108 = 01101100
l = 108 = 01101100
o = 111 = 01101111
“Hello” in spaced binary:
01001000 01100101 01101100 01101100 01101111
Reading a Binary Value
To decode a single byte manually:
01001000
↑↑↑↑↑↑↑↑
Each bit is a power of 2, right to left:
Position: 7 6 5 4 3 2 1 0
Bit: 0 1 0 0 1 0 0 0
Value: 0 +64+ 0 + 0 + 8 + 0 + 0 + 0 = 72 = 'H'
UTF-8 Multi-Byte Characters
UTF-8 uses variable-length encoding. Characters outside the ASCII range require 2, 3, or 4 bytes:
| Code point range | Bytes |
|---|---|
| U+0000 – U+007F | 1 byte (ASCII) |
| U+0080 – U+07FF | 2 bytes |
| U+0800 – U+FFFF | 3 bytes (most CJK characters) |
| U+10000 – U+10FFFF | 4 bytes (emoji, rare scripts) |
Examples:
© (U+00A9) = 2 bytes: 11000010 10101001
中 (U+4E2D) = 3 bytes: 11100100 10111000 10101101
😀 (U+1F600) = 4 bytes: 11110000 10011111 10011000 10000000
This is why the same character can produce different numbers of 8-bit groups — it depends on where it falls in the Unicode table.
Text to Binary: Online Tool
The fastest way to convert text to binary (and back) in your browser:
Try the ZeroTool Text to Binary Converter →
- Paste text → binary output updates instantly
- Choose Spaced (01001000 01100101…) or Continuous (0100100001100101…) format
- Paste binary → click Binary → Text to decode
- All processing runs in your browser — nothing is sent to a server
Converting in Code
JavaScript (Browser)
The browser’s TextEncoder API gives you the UTF-8 byte array:
function textToBinary(text) {
const encoder = new TextEncoder(); // UTF-8 by default
const bytes = encoder.encode(text);
return Array.from(bytes)
.map(byte => byte.toString(2).padStart(8, '0'))
.join(' ');
}
console.log(textToBinary('Hi'));
// 01001000 01101001
console.log(textToBinary('中'));
// 11100100 10111000 10101101
Decoding binary back to text:
function binaryToText(binary) {
const bytes = binary
.trim()
.replace(/\s+/g, ' ')
.split(' ')
.map(b => parseInt(b, 2));
const decoder = new TextDecoder('utf-8');
return decoder.decode(new Uint8Array(bytes));
}
console.log(binaryToText('01001000 01101001'));
// Hi
Python
def text_to_binary(text: str) -> str:
bytes_data = text.encode('utf-8')
return ' '.join(format(byte, '08b') for byte in bytes_data)
def binary_to_text(binary: str) -> str:
parts = binary.strip().split()
byte_values = [int(b, 2) for b in parts]
return bytes(byte_values).decode('utf-8')
print(text_to_binary('Hello'))
# 01001000 01100101 01101100 01101100 01101111
print(binary_to_text('01001000 01100101 01101100 01101100 01101111'))
# Hello
Command Line
# Text to binary (Python one-liner)
python3 -c "
import sys
text = sys.argv[1]
print(' '.join(format(b, '08b') for b in text.encode('utf-8')))
" "Hello"
# 01001000 01100101 01101100 01101100 01101111
Spaced vs Continuous Format
Two common binary string formats:
| Format | Example | Use case |
|---|---|---|
| Spaced | 01001000 01100101 | Human-readable, each byte separated |
| Continuous | 0100100001100101 | Compact, used in some protocols |
When decoding continuous binary, the string length must be a multiple of 8. If it’s not, the string is malformed or truncated.
Common Use Cases
Education — Binary encoding is a core CS concept. Converting familiar strings (your name, “Hello World”) to binary makes the abstraction concrete.
Debugging encoding issues — When a string appears garbled, inspecting the raw binary helps pinpoint where the encoding went wrong (wrong codec, truncated multi-byte sequence, BOM issues).
CTF and security challenges — Binary/text conversion is a staple in capture-the-flag competitions. Hidden messages are often encoded in binary.
Protocol design — Understanding how strings map to bytes matters when working with binary protocols, file formats, or network packets.
Binary vs Other Encodings
| Encoding | Output for “Hi” | Size | Readable? |
|---|---|---|---|
| Binary | 01001000 01101001 | ~2× | Barely |
| Hex | 48 69 | Compact | Medium |
| Base64 | SGk= | ~4/3× original | Yes |
| ASCII decimal | 72 105 | Compact | Medium |
Use Base64 when you need a text-safe encoding with lower overhead. Use ASCII Converter when you want decimal, hex, and octal values side by side.
Summary
- Text → binary works by encoding each character as UTF-8 bytes, then expressing each byte in 8-bit binary.
- ASCII characters use 1 byte; emoji and many Unicode characters use 2–4 bytes.
- Spaced format separates bytes with spaces; continuous format has no separators.
- Use
TextEncoder/TextDecoderin JavaScript and.encode('utf-8')in Python.