Character set
The alphabet a particular symbology can encode - digits only, alphanumeric, full ASCII, or beyond.
Why character sets matter
A symbology is not a free-form text encoder. Each one supports a specific list of characters - some only digits, some letters and digits, some the full 128-character ASCII alphabet, and some arbitrary bytes and Kanji. Choosing the wrong symbology for the data you need to carry is one of the most common mistakes when designing labels: a serial number with an embedded letter cannot go in an EAN-13, and a Chinese product name cannot go in a Code 39.
The character set is part of the symbology specification; it determines what module patterns are defined and which check-digit arithmetic can be applied.
Numeric-only symbologies
Encode digits 0–9 and nothing else (Code 11 adds a dash). This is the smallest character set, and typically gives the smallest printed footprint for a given number of characters.
| Symbology | Length | Notes |
|---|---|---|
| EAN-13 | 13 digits (incl. check) | Worldwide retail POS |
| EAN-8 | 8 digits | Small items |
| UPC-A | 12 digits | US/Canada retail POS |
| UPC-E | 8 digits (compressed UPC-A) | Very small items |
| Interleaved 2 of 5 | Even number of digits | Cartons, ITF-14 |
| ITF-14 | 14 digits (fixed) | Outer-case GTIN-14 |
| Industrial 2 of 5 | Variable digits | Older industrial labels |
| Code 11 | Digits 0–9 plus - | Telecoms equipment |
| PLANET, IMb | 12 or 14 digits (PLANET); 20/25/29/31 digits (IMb) | USPS mail tracking |
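
Because these symbologies carry digits only, their check-digit arithmetic is plain decimal. As an illustration, here is a minimal sketch of the GS1 mod-10 check digit used by the EAN/UPC family and ITF-14. The function name `gs1_check_digit` is hypothetical, and the input is assumed to be the body digits without the check digit.

```python
def gs1_check_digit(body: str) -> int:
    """Return the GS1 mod-10 check digit for the digits that precede it."""
    # Numeric-only symbologies reject anything outside 0-9.
    if not (body.isascii() and body.isdigit()):
        raise ValueError("numeric-only symbology: digits 0-9 only")
    # Weights alternate 3, 1, 3, ... starting from the rightmost body digit.
    total = sum(
        int(digit) * (3 if index % 2 == 0 else 1)
        for index, digit in enumerate(reversed(body))
    )
    return (10 - total % 10) % 10


print(gs1_check_digit("400638133393"))  # 12 body digits of an EAN-13
print(gs1_check_digit("03600029145"))   # 11 body digits of a UPC-A
```

The same weighting works for any GTIN length because it is anchored at the right-hand end of the number.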
Alphanumeric symbologies (43-character)
Encode digits 0–9, the 26 uppercase letters A–Z, plus a handful of special characters. Total alphabet is 43 characters - hence "mod-43" check-digit arithmetic.
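The mod-43 arithmetic follows directly from the alphabet size: each character's value is its position in the 43-character alphabet, and the check character is the one at position (sum of values) mod 43. The sketch below assumes the Code 39 value ordering; `CODE39_ALPHABET` and `mod43_check_char` are illustrative names, not part of any library.

```python
# The 43 characters in value order: 0-9, A-Z, then - . space $ / + %
CODE39_ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ-. $/+%"


def mod43_check_char(data: str) -> str:
    """Return the mod-43 check character for a 43-character-alphabet payload."""
    total = 0
    for ch in data:
        value = CODE39_ALPHABET.find(ch)
        if value < 0:
            # Lowercase letters and other characters are outside the alphabet.
            raise ValueError(f"{ch!r} is not in the 43-character alphabet")
        total += value
    return CODE39_ALPHABET[total % 43]


print(mod43_check_char("CODE39"))
```

The `ValueError` branch is the practical point: data outside the alphabet cannot be encoded at all, so validation belongs before any check-character calculation.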
0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O P Q R S T U V W X Y Z - . space $ / + %
The symbologies that share this 43-character alphabet:
Full ASCII (128-character)
Encode the complete 128-character ASCII alphabet - digits, both letter cases, punctuation, control codes. The data set is internally divided into three subsets that the encoder switches between to save modules.
| Symbology | Character set | Subset trick |
|---|---|---|
| Code 128 | Full ASCII (0–127) | Three character subsets: A (control + uppercase + digits), B (printable ASCII + lowercase), C (numeric pairs - two digits per character for high density). Encoder picks the optimal subset and switches with code-set (CODE A/B/C) or SHIFT characters. |
| GS1-128 | Code 128 + FNC1 separator | The first character after the start character is FNC1 (a Code 128 function character); subsequent FNC1s separate variable-length fields. The character set is the same as Code 128. |
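
To make the subset C density gain concrete, the sketch below packs a digit string into subset C symbol values, one value (00–99) per pair of digits. It is a simplified illustration only: the hypothetical `code_c_values` helper ignores start, code-set switch, check, and stop characters, and assumes an even-length all-digit input.

```python
def code_c_values(digits: str) -> list[int]:
    """Pack an even-length digit string into Code 128 subset C values (0-99)."""
    if not (digits.isascii() and digits.isdigit()) or len(digits) % 2:
        raise ValueError("subset C needs an even number of digits 0-9")
    # Each symbol character in subset C carries two decimal digits.
    return [int(digits[i:i + 2]) for i in range(0, len(digits), 2)]


payload = "123456"
values = code_c_values(payload)
print(values)                     # one value per digit pair
print(len(payload), len(values))  # data characters needed in subset B vs subset C
```

An odd-length or mixed payload is where a real encoder has to decide whether switching subsets mid-symbol is worth the extra switch character.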
2D symbologies: multiple data modes
2D barcodes support several internal encoding modes and switch between them for efficiency - numeric data packs more densely than mixed alphanumeric, and a byte mode covers anything outside the structured alphabets.
| Symbology | Modes |
|---|---|
| QR Code | Numeric, alphanumeric (45-char), byte (ISO-8859-1 default; UTF-8 via ECI), Kanji (Shift JIS), and ECI for switching to other encodings. Per-segment mode switch within one symbol. |
| Data Matrix | ASCII, C40 (uppercase + digits, 3 chars per 2 codewords), Text (lowercase variant), X12 (EDI), EDIFACT, Base 256 (byte mode). Per-segment mode switch. |
| PDF417 | Text (alphanumeric + punctuation), Byte, Numeric. Per-segment mode switch with latch and shift codes. |
| Aztec | Upper, Lower, Mixed, Punct, Digit, Byte modes. Designed for high efficiency on short data. |
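
As a rough illustration of why mode choice matters, the sketch below picks the densest single QR Code mode for a string and estimates the payload bits. It is a simplification under stated assumptions: it ignores Kanji mode, mode indicators, character-count fields, ECI headers, and per-segment switching, and it counts byte-mode data as UTF-8 bytes. The names `qr_mode` and `payload_bits` are hypothetical.

```python
# QR alphanumeric mode: 45 characters (digits, uppercase, and nine symbols).
QR_ALPHANUMERIC = set("0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:")


def qr_mode(data: str) -> str:
    """Pick the densest single mode that can hold the whole string."""
    if data.isascii() and data.isdigit():
        return "numeric"
    if all(ch in QR_ALPHANUMERIC for ch in data):
        return "alphanumeric"
    return "byte"


def payload_bits(data: str, mode: str) -> int:
    """Estimate data bits only (no mode indicator or length field)."""
    n = len(data)
    if mode == "numeric":
        # 10 bits per 3 digits; 4 or 7 bits for a trailing 1 or 2 digits.
        return 10 * (n // 3) + (0, 4, 7)[n % 3]
    if mode == "alphanumeric":
        # 11 bits per 2 characters; 6 bits for a trailing single character.
        return 11 * (n // 2) + 6 * (n % 2)
    # Assumption: byte mode carrying UTF-8, 8 bits per encoded byte.
    return 8 * len(data.encode("utf-8"))


for text in ("0123456789", "HELLO WORLD", "hello"):
    mode = qr_mode(text)
    print(text, mode, payload_bits(text, mode))
```

Note that a single lowercase letter pushes the whole string into byte mode here; a real encoder can split the data into segments and use a different mode for each.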
Quick reference: pick a symbology by character set
| If your data is… | Use… |
|---|---|
| Digits only, retail GTIN | EAN-13 / UPC-A (or QR Digital Link) |
| Digits only, outer carton GTIN-14 | ITF-14 |
| Mixed uppercase letters and digits, short | Code 39 or LOGMARS |
| Mixed case, special characters, variable length | Code 128 or GS1-128 with AIs |
| Anything longer than ~30 characters | QR Code, Data Matrix, PDF417, or Aztec |
| Unicode / non-Latin text | QR (Kanji + byte/ECI modes) or Data Matrix (byte mode + ECI) |
| Embedded URL for consumer scanning | QR (GS1 Digital Link if retail) |
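
The table above can be approximated in code by finding the smallest character set that covers the data and mapping it to candidate symbologies. This is a minimal sketch, not a complete selector: the function names, the category labels, and the 30-character threshold parameter are assumptions for illustration, and fixed-length rules (for example 13 digits for EAN-13) still have to be checked separately.

```python
CODE39_ALPHABET = set("0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ-. $/+%")

SUGGESTIONS = {
    "numeric": ["EAN-13", "UPC-A", "ITF-14"],
    "alphanumeric-43": ["Code 39", "LOGMARS"],
    "full-ascii": ["Code 128", "GS1-128"],
    "beyond-ascii": ["QR Code", "Data Matrix"],
}
TWO_D = ["QR Code", "Data Matrix", "PDF417", "Aztec"]


def smallest_character_set(data: str) -> str:
    """Classify data by the smallest character set that covers it."""
    if data.isascii() and data.isdigit():
        return "numeric"
    if all(ch in CODE39_ALPHABET for ch in data):
        return "alphanumeric-43"
    if data.isascii():
        return "full-ascii"
    return "beyond-ascii"


def suggest(data: str, max_linear_chars: int = 30) -> list[str]:
    """Return candidate symbologies following the quick-reference table."""
    if not data:
        raise ValueError("nothing to encode")
    charset = smallest_character_set(data)
    # Long ASCII payloads are better served by a 2D symbol.
    if charset != "beyond-ascii" and len(data) > max_linear_chars:
        return TWO_D
    return SUGGESTIONS[charset]


for sample in ("4006381333931", "SN-12A", "sn-12a", "产品名称"):
    print(sample, suggest(sample))
```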