Code page

In computing, a code page is a character encoding: a specific association of a set of printable characters and control characters with unique numbers. Typically each number is the binary value of a single byte. (In some contexts these terms are used more precisely; see Character encoding § Terminology.)
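As a minimal illustration of this association, the sketch below decodes the same single byte under three code pages using Python's built-in codecs. The helper name `decode_byte` is hypothetical, and `cp437`, `cp1252`, and `cp037` are Python's codec labels for those code pages rather than vendor identifiers.

```python
def decode_byte(value: int, code_page: str) -> str:
    """Return the character that the given code page assigns to one byte value."""
    return bytes([value]).decode(code_page)


# The same byte value stands for a different character in each code page.
for name in ("cp437", "cp1252", "cp037"):
    ch = decode_byte(0xE9, name)
    # Print the Unicode code point rather than the glyph, so the output
    # does not depend on the console's own encoding.
    print(f"{name}: 0xE9 -> U+{ord(ch):04X}")
```

Under these codecs, byte E9h is a Greek capital theta in code page 437, a Latin small e with acute in Windows-1252, and the letter Z in the EBCDIC code page 037.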

The term "code page" originated from IBM's EBCDIC-based mainframe systems,^[1] but Microsoft, SAP,^[2] and Oracle Corporation^[3] are among the vendors that also use it. Most vendors identify their own character sets by name. Where there are a great many character sets (as at IBM), numbering them is a convenient way to tell them apart. Originally, the code page numbers referred to the page numbers in the IBM standard character set manual,^[4]^[5]^[6] a correspondence that has long since ceased to hold. Vendors that use a code page system allocate their own code page number to a character encoding, even if it is better known by another name; for example, UTF-8 has been assigned code page number 1208 at IBM, 65001 at Microsoft, and 4110 at SAP.
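Because each vendor numbers encodings independently, a code page number is only meaningful together with the vendor that assigned it. The following sketch shows one way this could be modelled; the table holds only the three UTF-8 numbers quoted above, and the names `UTF8_CODE_PAGE_NUMBERS` and `is_utf8` are hypothetical.

```python
# UTF-8 code page numbers per vendor, as quoted in the text above.
UTF8_CODE_PAGE_NUMBERS = {
    "IBM": 1208,
    "Microsoft": 65001,
    "SAP": 4110,
}


def is_utf8(vendor: str, number: int) -> bool:
    """Return True if `number` is the UTF-8 code page number used by `vendor`."""
    return UTF8_CODE_PAGE_NUMBERS.get(vendor) == number


print(is_utf8("IBM", 1208))        # the IBM number, checked against IBM
print(is_utf8("Microsoft", 1208))  # the IBM number, checked against Microsoft
```

The second call returns `False`: 1208 denotes UTF-8 only in IBM's numbering.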

Hewlett-Packard uses a similar concept in its HP-UX operating system and its Printer Command Language^[7] (PCL) protocol for printers (whether or not the printer is made by HP). The terminology, however, is different: what others call a character set, HP calls a symbol set, and what IBM or Microsoft call a code page, HP calls a symbol set code. HP developed a series of symbol sets,^[8]^[9] each with an associated symbol set code, to encode both its own character sets and other vendors’ character sets.

The multitude of character sets leads many vendors to recommend Unicode.
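In practice, moving to Unicode usually means transcoding text stored under a legacy code page. A minimal sketch, assuming the source code page is known in advance and is available as a Python codec; the function name `to_utf8` is hypothetical.

```python
def to_utf8(data: bytes, source_code_page: str) -> bytes:
    """Transcode bytes from a legacy code page to UTF-8.

    Decoding maps each byte to a Unicode character; encoding then writes
    those characters out as UTF-8.
    """
    return data.decode(source_code_page).encode("utf-8")


# The same word stored under two different code pages yields identical UTF-8.
from_dos = to_utf8(b"caf\x82", "cp437")       # 82h is e-acute in code page 437
from_windows = to_utf8(b"caf\xe9", "cp1252")  # E9h is e-acute in Windows-1252
print(from_dos == from_windows)
```

If the wrong source code page is assumed, decoding may still succeed but produce different characters, which is one reason a single universal encoding is attractive.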

[1]

[2]

[3]

[4]

[5]

[6]

[7]

[8]

[9]

ID	Names	Description	Origin	Platform	DOS	OS/2	Windows	Mac	Else	Encoding	Comment
0	—	Reserved	IBM, Microsoft	—	3.3+	1.0+	?	?	?		Internal OS use^[34]
437	CP437, IBM437	PC US	IBM^[46]	IBM PC	3.3+	1.0+	Yes	?	Yes	8-bit SBCS
57344 - 61439	—	Private use derivations	IBM	—	—	—	—	—	—	various	Private use code page derivations (E000h-EFFFh)
65280 - 65533	—	Private use definitions	IBM	—	—	—	—	—	—	various	Private use code page definitions (FF00h-FFFDh)
65534	—	Reserved	IBM, Microsoft	—	?	?	?	?	?	various	Internal OS use (FFFEh)
65535	—	Reserved	IBM, Microsoft	—	3.3+	1.0+	?	?	?	various	Internal OS use (FFFFh)^[34]
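The reserved and private-use ranges in the rows above can be checked programmatically. The sketch below covers only the rows shown in this excerpt and labels everything else "other"; the function name `classify_code_page_id` and the returned labels are hypothetical.

```python
def classify_code_page_id(cp_id: int) -> str:
    """Classify a code page ID using only the ranges listed in the table excerpt."""
    if cp_id in (0, 0xFFFE, 0xFFFF):
        return "reserved"                 # internal OS use
    if 0xE000 <= cp_id <= 0xEFFF:
        return "private use derivation"   # 57344 to 61439
    if 0xFF00 <= cp_id <= 0xFFFD:
        return "private use definition"   # 65280 to 65533
    return "other"                        # not covered by the rows shown


for cp_id in (0, 437, 57344, 65280, 65535):
    print(cp_id, classify_code_page_id(cp_id))
```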

The code page numbering system

Relationship to ASCII

Relationship to Unicode

IBM code pages

EBCDIC-based code pages

DOS code pages

IBM AIX code pages

IBM OS/2 code pages

Windows emulation code pages

Macintosh emulation code pages

Adobe emulation code pages

HP emulation code pages

DEC emulation code pages

IBM Unicode code pages

Microsoft code pages

Windows code pages

DBCS code pages

MS-DOS code pages

Macintosh emulation code pages

Various other Microsoft code pages

Microsoft Unicode code pages

HP Symbol Sets

HP own Symbol Sets

Symbol Sets from other vendors

Code pages from other vendors

List of code page assignments

Criticism

Private code pages

See also

References

External links