ISO-IR-111

RFC 1345's "ECMA-Cyrillic"
Language(s)	Russian, Belarusian, Macedonian, Serbian
Standard	RFC 1345
Classification	Extended ASCII
Transforms / Encodes	ISO-IR-111
Other related encoding(s)	ISO-8859-5, Windows-1251

KOI8-E (1986)
Alias(es)	ISO-IR-111
Language(s)	Russian, Belarusian, Macedonian, Serbian, Ukrainian (partial)
Standard	ECMA-113:1986
Classification	Extended ASCII, KOI
Extends	KOI8-B
Succeeded by	ECMA-113:1988 (ISO-8859-5)
Other related encoding(s)	KOI8-F

ISO-IR-111^[1] or KOI8-E^[2] is an 8-bit character set. It is a multinational extension of KOI-8 that adds letters for Belarusian, Macedonian, Serbian, and Ukrainian (except Ґ/ґ, which is added only in KOI8-F). The name "ISO-IR-111" refers to its registration number in the ISO-IR registry, and marks it as a set usable with ISO/IEC 2022.

It was defined by the first (1986) edition of ECMA-113,^[3] which is the Ecma International standard corresponding to ISO/IEC 8859-5, and as such also corresponds to a 1987 draft version of ISO-8859-5.^[4] The published editions of ISO/IEC 8859-5 instead correspond to subsequent editions of ECMA-113, which define a different encoding.^[5]

Naming confusion

ISO-IR-111, registered in 1985 and published in 1986 as the first edition of ECMA-113 (also called "ECMA-Cyrillic" or "KOI8-E"), was based on the 1974 edition of GOST 19768 (i.e. KOI-8). In 1987 ECMA-113 was redesigned.^[5] These newer editions of ECMA-113 are equivalent to ISO-8859-5,^[5]^[6] and do not follow the KOI layout. Because two different encodings share the ECMA-113 name, a common misconception has arisen that ISO-8859-5 was defined in or based on GOST 19768-74.^[6]

Possibly as another consequence of this, RFC 1345 erroneously lists a different codepage under the names "ISO-IR-111" and "ECMA-Cyrillic", resembling ISO-8859-5 with re-ordered rows, and partially compatible with Windows-1251.^[7]^[6] Due to concerns that existing implementations might use the RFC 1345 definition for those two labels, it was proposed that the IANA additionally recognise KOI8-E as a label for ECMA-113:1985 content,^[7] and the IANA presently lists that label as an alias.^[2]

Character set

The following table shows the ISO-IR-111 encoding. Each character is shown with its equivalent Unicode code point.

ISO-IR-111
	0	1	2	3	4	5	6	7	8	9	A	B	C	D	E	F
0x
1x
2x	SP	!	"	#	$	%	&	'	(	)	*	+	,	-	.	/
3x	0	1	2	3	4	5	6	7	8	9	:	;	<	=	>	?
4x	@	A	B	C	D	E	F	G	H	I	J	K	L	M	N	O
5x	P	Q	R	S	T	U	V	W	X	Y	Z	[	\	]	^	_
6x	`	a	b	c	d	e	f	g	h	i	j	k	l	m	n	o
7x	p	q	r	s	t	u	v	w	x	y	z	{	|	}	~
8x
9x
Ax	NBSP	ђ 0452	ѓ 0453	ё 0451	є 0454	ѕ 0455	і 0456	ї 0457	ј 0458	љ 0459	њ 045A	ћ 045B	ќ 045C	SHY	ў 045E	џ 045F
Bx	№ 2116	Ђ 0402	Ѓ 0403	Ё 0401	Є 0404	Ѕ 0405	І 0406	Ї 0407	Ј 0408	Љ 0409	Њ 040A	Ћ 040B	Ќ 040C	¤ 00A4	Ў 040E	Џ 040F
Cx	ю 044E	а 0430	б 0431	ц 0446	д 0434	е 0435	ф 0444	г 0433	х 0445	и 0438	й 0439	к 043A	л 043B	м 043C	н 043D	о 043E
Dx	п 043F	я 044F	р 0440	с 0441	т 0442	у 0443	ж 0436	в 0432	ь 044C	ы 044B	з 0437	ш 0448	э 044D	щ 0449	ч 0447	ъ 044A
Ex	Ю 042E	А 0410	Б 0411	Ц 0426	Д 0414	Е 0415	Ф 0424	Г 0413	Х 0425	И 0418	Й 0419	К 041A	Л 041B	М 041C	Н 041D	О 041E
Fx	П 041F	Я 042F	Р 0420	С 0421	Т 0422	У 0423	Ж 0416	В 0412	Ь 042C	Ы 042B	З 0417	Ш 0428	Э 042D	Щ 0429	Ч 0427	Ъ 042A
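
As an illustration, the table above can be transcribed into a small decoder. The following Python sketch is not part of any standard or library: the names `HIGH_HALF` and `decode_iso_ir_111` are hypothetical, and rejecting bytes 0x80 to 0x9F is an assumption made here because the table assigns no graphic characters to those positions.

```python
# Code points for bytes 0xA0-0xFF, transcribed row by row from the table above.
_ROW_A = [0x00A0, 0x0452, 0x0453, 0x0451, 0x0454, 0x0455, 0x0456, 0x0457,
          0x0458, 0x0459, 0x045A, 0x045B, 0x045C, 0x00AD, 0x045E, 0x045F]
_ROW_B = [0x2116, 0x0402, 0x0403, 0x0401, 0x0404, 0x0405, 0x0406, 0x0407,
          0x0408, 0x0409, 0x040A, 0x040B, 0x040C, 0x00A4, 0x040E, 0x040F]
# Rows Cx and Dx: lowercase letters in KOI order.
_ROWS_CD = [0x044E, 0x0430, 0x0431, 0x0446, 0x0434, 0x0435, 0x0444, 0x0433,
            0x0445, 0x0438, 0x0439, 0x043A, 0x043B, 0x043C, 0x043D, 0x043E,
            0x043F, 0x044F, 0x0440, 0x0441, 0x0442, 0x0443, 0x0436, 0x0432,
            0x044C, 0x044B, 0x0437, 0x0448, 0x044D, 0x0449, 0x0447, 0x044A]
# Rows Ex and Fx: the same letters in uppercase, which in Unicode sit
# exactly 0x20 below their lowercase forms (as the table confirms).
_ROWS_EF = [cp - 0x20 for cp in _ROWS_CD]

HIGH_HALF = _ROW_A + _ROW_B + _ROWS_CD + _ROWS_EF  # 96 entries


def decode_iso_ir_111(data: bytes) -> str:
    """Decode bytes as ISO-IR-111 (KOI8-E) using the table in this article."""
    out = []
    for b in data:
        if b < 0x80:
            out.append(chr(b))                    # ASCII half, passed through
        elif b >= 0xA0:
            out.append(chr(HIGH_HALF[b - 0xA0]))  # Cyrillic half
        else:
            # Assumption: 0x80-0x9F carry no graphic characters, so reject them.
            raise ValueError(f"byte 0x{b:02X} is not assigned in ISO-IR-111")
    return "".join(out)


print(decode_iso_ir_111(b"\xf0\xd2\xc9\xd7\xc5\xd4"))  # Привет
```

A lookup list indexed by `byte - 0xA0` keeps the sketch close to the layout of the table, so each row can be checked against it directly.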

Extended and modified versions

A modified version named KOI8 Unified or KOI8-F was used in software produced by Fingertip Software, adding the Ґ in its KOI8-U location (replacing the soft hyphen and displacing the universal currency sign), and adding some graphical characters in the C1 control codes area, mainly from KOI8-R and Windows-1251.^[4]^[6]^[8]^[9]
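
The relocation described above can be illustrated with Python's built-in `koi8_u` codec, which gives the KOI8-U positions of ґ and Ґ. This is only a sketch of the two affected cells: the function name `ghe_upturn_overrides` and the dictionary `ISO_IR_111_CELLS` are hypothetical, and the additional KOI8-F characters in the C1 area are not modelled.

```python
# The two ISO-IR-111 cells affected, taken from the character set table
# in this article: soft hyphen at 0xAD and currency sign at 0xBD.
ISO_IR_111_CELLS = {0xAD: "\u00ad", 0xBD: "\u00a4"}


def ghe_upturn_overrides() -> dict:
    """Return {byte: character} for ґ and Ґ at their KOI8-U positions."""
    overrides = {}
    for ch in ("\u0491", "\u0490"):     # ґ (small), Ґ (capital)
        (byte,) = ch.encode("koi8_u")   # each letter encodes to a single byte
        overrides[byte] = ch
    return overrides


# Show which ISO-IR-111 character each KOI8-F letter takes the place of.
for byte, ch in sorted(ghe_upturn_overrides().items()):
    print(f"0x{byte:02X}: {ISO_IR_111_CELLS[byte]!r} -> {ch!r}")
```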

Incorrect RFC 1345 code page

RFC 1345 erroneously lists a different code page under the name ISO-IR-111, encoding the same Cyrillic characters but with a different layout. It resembles a mixture of Windows-1251 and ISO-8859-5.^[7] Specifically, row A_ corresponds to row A_ of ISO-8859-5, rows C_ through F_ correspond to Windows-1251^[6] (equivalent to rows B_ through E_ of ISO-8859-5), and row B_ nearly corresponds to row F_ of ISO-8859-5, except that the § is replaced with a ¤.
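
The row correspondences described above can be checked mechanically. The following Python sketch assembles the layout from the standard `iso8859_5` and `cp1251` codecs; the helper names `row` and `rfc1345_rows` are hypothetical, and the result is a reconstruction from this description rather than a copy of the RFC 1345 table itself.

```python
def row(codec: str, high_nibble: int) -> str:
    """Decode the 16 bytes of one row (for example 0xA0-0xAF) with a codec."""
    start = high_nibble << 4
    return bytes(range(start, start + 16)).decode(codec)


def rfc1345_rows() -> dict:
    """Build rows A_ to F_ of the mislabelled code page, keyed by high nibble."""
    rows = {}
    # Row A_ is the same as row A_ of ISO-8859-5.
    rows[0xA] = row("iso8859_5", 0xA)
    # Row B_ is row F_ of ISO-8859-5 with the section sign (U+00A7)
    # replaced by the currency sign (U+00A4).
    rows[0xB] = row("iso8859_5", 0xF).replace("\u00a7", "\u00a4")
    # Rows C_ to F_ follow Windows-1251.
    for nibble in range(0xC, 0x10):
        rows[nibble] = row("cp1251", nibble)
    return rows


for nibble, chars in sorted(rfc1345_rows().items()):
    print(f"{nibble:X}x", " ".join(repr(c)[1:-1] for c in chars))
```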

Certain codes resemble ISO-IR-111 with flipped letter case, which may have contributed to the confusion. The majority differ and are shown below.

Code page erroneously labelled "ISO-IR-111" or "ECMA-Cyrillic" in RFC 1345
	0	1	2	3	4	5	6	7	8	9	A	B	C	D	E	F
Ax	NBSP	Ё	Ђ	Ѓ	Є	Ѕ	І	Ї	Ј	Љ	Њ	Ћ	Ќ	SHY	Ў	Џ
Bx	№	ё	ђ	ѓ	є	ѕ	і	ї	ј	љ	њ	ћ	ќ	¤	ў	џ
Cx	А	Б	В	Г	Д	Е	Ж	З	И	Й	К	Л	М	Н	О	П
Dx	Р	С	Т	У	Ф	Х	Ц	Ч	Ш	Щ	Ъ	Ы	Ь	Э	Ю	Я
Ex	а	б	в	г	д	е	ж	з	и	й	к	л	м	н	о	п
Fx	р	с	т	у	ф	х	ц	ч	ш	щ	ъ	ы	ь	э	ю	я


References

[1] ECMA (1 August 1985). Right-hand Part of the Cyrillic Alphabet (PDF). ITSCJ/IPSJ. ISO-IR-111.
[2] "Character Sets". IANA.
[3] ECMA-113. 8-Bit Single-Byte Coded Graphic Character Sets - Latin/Cyrillic Alphabet (1st ed., June 1986)
[4] Czyborra, Roman (1998-11-30) [1998-05-25]. "The Cyrillic Charset Soup". Archived from the original on 2016-12-03. Retrieved 2016-12-03.
[5] ECMA-113. 8-Bit Single-Byte Coded Graphic Character Sets - Latin/Cyrillic Alphabet (2nd ed., June 1988)
[6] Nechayev, Valentin (2013) [2001]. "Review of 8-bit Cyrillic encodings universe". Archived from the original on 2016-12-05. Retrieved 2016-12-05.
[7] Sokolov, Michael (2003-04-05). "ECMA-cyrillic alias iso-ir-111 sore". IETF Charsets Mailing List.
[8] "KOI8 Unified". Fingertip Software. Archived from the original on 1998-01-09. Retrieved 2020-02-11.
[9] Leisher, Mark (2008) [1998-03-05]. "KOI8 Unified Cyrillic to Unicode 2.1 mapping table". Department of Mathematical Sciences, New Mexico State University. Retrieved 2020-05-02.
