| XML Name | Java Name | First supported in Java | Scripts and Languages |
| --- | --- | --- | --- |
| ISO-8859-1 | 8859_1 | 1.1 | Latin-1: ASCII plus the accented characters needed for most Western European languages including Albanian, Basque, Breton, Catalan, Cornish, Danish, Dutch, English, Estonian, Faroese, Finnish, French, Frisian, Galician, German, Greenlandic, Icelandic, Irish, Italian, Latin, Luxemburgish, Norwegian, Portuguese, Rhaeto-Romanic, Scottish Gaelic, Sorbian, Spanish, and Swedish as well as many non-European languages written in the Latin alphabet such as Swahili and Malaysian |
| ISO-8859-2 | 8859_2 | 1.1 | Latin-2: ASCII plus the accented characters needed for most Central European languages including Albanian, Croatian, Czech, Finnish, German, Hungarian, Latin, Polish, Romanian, Slovak, Slovenian, and Sorbian |
| ISO-8859-3 | 8859_3 | 1.1 | Latin-3: ASCII plus the accented characters needed for most Southern European languages including English, Esperanto, Finnish, French, German, Italian, Latin, Maltese, Portuguese, and Turkish |
| ISO-8859-4 | 8859_4 | 1.1 | Latin-4: ASCII plus the accented characters needed for most Northern European languages including Danish, English, Estonian, Finnish, German, Greenlandic, Latin, Latvian, Lithuanian, Norwegian, Sámi, Slovenian, and Swedish |
| ISO-8859-5 | 8859_5 | 1.1 | ASCII plus Cyrillic |
| ISO-8859-6 | 8859_6 | 1.1 | ASCII plus Arabic |
| ISO-8859-7 | 8859_7 | 1.1 | ASCII plus Greek |
| ISO-8859-8 | 8859_8 | 1.1 | ASCII plus Hebrew |
| ISO-8859-9 | 8859_9 | 1.1 | Latin-5: same as Latin-1 except the Turkish letters Ğ, ğ, İ, ı, Ş, and ş take the place of the Icelandic letters þ, Þ, ý, Ý, Ð, and ð |
| ISO-8859-13 | ISO8859_13 | 1.3 | Latin-7: ASCII plus the accented characters needed for most Baltic languages including Latvian, Lithuanian, Estonian, and Finnish, as well as English, Danish, Swedish, German, Slovenian, and Norwegian. |
| ISO-8859-15 | ISO8859_15_FDIS | 1.2 | Latin-9: same as Latin-1 but with the Euro sign € instead of the international currency symbol ¤. It also replaces the infrequently used symbol characters ¦, ¨, ´, ¸, ¼, ½, and ¾ with the infrequently used French and Finnish letters Š, š, Ž, ž, Œ, œ, and Ÿ. |
| UTF-8 | UTF8 | 1.1 | The default encoding of XML documents; each Unicode character is represented in between 1 and 4 bytes. |
| UTF-16 | UnicodeBig or UnicodeLittle | 1.2 | An encoding of Unicode in which characters in the Basic Multilingual Plane are encoded in two bytes, and all other characters are encoded as two two-byte surrogates |
| ISO-10646-UCS-2 | N/A | N/A | A straightforward encoding in which each Unicode character is represented as a two-byte integer; cannot represent characters outside the Basic Multilingual Plane |
| ISO-10646-UCS-4 | N/A | N/A | A straightforward encoding in which each Unicode character is represented as a four-byte integer |
| ISO-2022-JP | JIS | 1.1 | Japanese |
| Shift_JIS | SJIS | 1.1 | Japanese |
| EUC-JP | EUCJIS | 1.1 | Japanese |
| US-ASCII | ASCII | 1.2 | English |
| GBK | GBK | 1.1 | Simplified Chinese |
| Big5 | Big5 | 1.1 | Traditional Chinese |
| ISO-2022-CN | ISO2022CN | 1.1 | Traditional Chinese |
| ISO-2022-KR | ISO2022KR | 1.1 | Korean |
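
As a rough illustration of how the names in the table are used from Java, the following sketch looks up an encoding by name and compares how many bytes the same text occupies in different encodings. The class name `EncodingDemo` and its two helper methods are hypothetical, introduced only for this example. It assumes a JDK recent enough to provide `java.nio.charset.Charset` and `StandardCharsets`, and it assumes that the legacy Java names `8859_1` and `UTF8` are still registered as aliases of the canonical names, which appears to be the case in current JDKs but is not guaranteed for every name in the table.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of any library.
class EncodingDemo {

    // Number of bytes needed to store text in the named encoding.
    // Charset.forName accepts canonical names such as "ISO-8859-1"
    // as well as registered aliases such as "8859_1".
    static int byteLength(String text, String encodingName) {
        Charset charset = Charset.forName(encodingName);
        return text.getBytes(charset).length;
    }

    // True if every character of text can be represented in the named encoding.
    static boolean canEncode(String text, String encodingName) {
        return Charset.forName(encodingName).newEncoder().canEncode(text);
    }

    public static void main(String[] args) {
        String cafe = "caf\u00E9";        // "café": three ASCII letters plus e-acute
        String euro = "\u20AC";           // the Euro sign
        String emoji = "\uD83D\uDE00";    // U+1F600, outside the Basic Multilingual Plane

        // Latin-1 uses one byte per character; UTF-8 needs two bytes for the e-acute.
        System.out.println(byteLength(cafe, "ISO-8859-1")); // 4
        System.out.println(byteLength(cafe, "UTF-8"));      // 5

        // A character outside the BMP takes four bytes in UTF-8 and a
        // surrogate pair (two two-byte units) in UTF-16.
        System.out.println(byteLength(emoji, "UTF-8"));     // 4
        System.out.println(byteLength(emoji, "UTF-16BE"));  // 4

        // Latin-1 has no Euro sign; UTF-8 can represent it.
        System.out.println(canEncode(euro, "ISO-8859-1"));  // false
        System.out.println(canEncode(euro, "UTF-8"));       // true

        // The legacy Java name resolves to the same charset as the XML name.
        System.out.println(
            Charset.forName("8859_1").equals(StandardCharsets.ISO_8859_1)); // true
    }
}
```

`UTF-16BE` is used rather than `UTF-16` in the byte count because Java's `UTF-16` encoder prepends a two-byte byte order mark, which would obscure the per-character size.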