Talk:Multinational Character Set

From Wikipedia, the free encyclopedia

restored C0; added C1 – These are part of MCS according to "VT220 Programmer Reference Manual" §2.4.1

Actually, the manual does not state this. It is explicit regarding the GL (that is, C0 is separate). It is ambiguous regarding C1. The ambiguity can be resolved by noting that including C1 in the MCS contradicts the description of NRC. Disagreements should point to a specific statement in the manual rather than to a section. TEDickey (talk) 15:28, 6 October 2012 (UTC)

Actually the manual does state this. Here are specific statements from §2.4.1.
Paragraph 1: "By factory default, when you power up or reset the terminal, the DEC multinational character set (Table 2-3) is mapped into the 8-bit code matrix (columns 0 through 15)."
  • This includes columns 0–1 and 8–9 which contain the C0 and C1 codes.
  • Table 2-3 includes C0 and C1 as part of the DEC Multinational Character Set in the table headings and shows the C0 and C1 codes in the table.
Paragraph 2: "The 7-bit compatible left half of the DEC multinational set is the ASCII graphics set. The C0 codes are the ASCII control characters, and the GL codes are the ASCII graphics set."
  • This uses the phrase "ASCII graphics set" in an ambiguous way that could be interpreted as excluding C0 from the MCS. To be consistent with paragraphs 1 and 3, I interpret this as including C0.
Paragraph 3, sentence 1: "The 8-bit compatible right half of the DEC multinational set includes the C1 8-bit control characters in columns 8 and 9."
  • This explicitly includes C1 in the MCS.
Coroboy (talk) 08:58, 23 May 2014 (UTC)
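For readers following the column references above, here is a minimal Python sketch of how a byte value maps onto the 16-column code matrix. The function names `matrix_position` and `region` are made up for illustration and come from no manual. The C0 and C1 columns follow the quoted statements (columns 0–1 and 8–9); treating columns 2–7 as GL and 10–15 as GR is the usual ISO 2022 layout. The sketch only locates a code in the matrix and takes no side on whether C0 and C1 belong to the MCS.

```python
def matrix_position(byte):
    """Return (column, row) of a byte in the 8-bit code matrix."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected a value in the range 0..255")
    # The high nibble selects the column, the low nibble the row.
    return byte >> 4, byte & 0x0F


def region(byte):
    """Name the matrix region (C0, GL, C1 or GR) that holds a byte."""
    column, _row = matrix_position(byte)
    if column <= 1:
        return "C0"  # columns 0-1: 7-bit control codes
    if column <= 7:
        return "GL"  # columns 2-7 (includes SP at 2/0 and DEL at 7/15)
    if column <= 9:
        return "C1"  # columns 8-9: 8-bit control codes
    return "GR"      # columns 10-15


if __name__ == "__main__":
    for value in (0x1B, 0x41, 0x9B, 0xDD):
        print(f"0x{value:02X}", matrix_position(value), region(value))
```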

DEC-MCS is a seven-bit (94-character) character set designed to be used with ECMA-35 (ISO 2022). The whole ECMA-35 scheme is horribly complex with, in theory, all four of the C0, GL, C1 and GR sections independently replaceable. I don't think any physical terminal ever used the sequences that allowed replacement of either set of control characters (C0 0..31 and C1 128..159), but the standard allows it. The GL and GR sets, however, are replaceable: GL (33..126) can be replaced with any 94-character set, and GR (160..255) can be replaced with any 94- OR 96-character set. DEC-MCS could be mapped into GL (33..126) if you wanted, or if you had a really old terminal like a real VT100-AVO that doesn't have 'GR'. In addition there was another layer of complexity in that you can only map G0, G1, G2 or G3 into GL and GR. The actual character sets you want to select have to be mapped into G0..3 first. I think this was so that you could have 512 character patterns in RAM and leave the hundreds of other sets compressed in ROM; but AFAIK it never worked that way.

Sane people avoided the whole mess and used codepages where 0..127 was by definition US-ASCII and 128..255 was your flavour of the month. So back in the real world C0 and C1 would be fixed by the terminal maker, GL would be US-ASCII to preserve everyone's sanity and DEC-MCS would be mapped into GR for the diacriticals. So the table with 256 characters is actually correct, despite being completely wrong. 2001:470:1F09:10D6:5E26:AFF:FE7D:2F72 (talk) 17:12, 7 November 2015 (UTC)
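The two-step selection described above (put a set into one of G0..G3, then map that slot into GL or GR) can be sketched as a toy model in Python. Everything here is hypothetical: the class `CodeMatrix`, its methods, and the two-entry `SUPPLEMENTAL_SAMPLE` table are illustration only, not a terminal implementation. The sketch handles 94-character sets only and ignores control codes and the escape sequences that drive the real mechanism.

```python
# 94-character ASCII graphics set, keyed by 7-bit code (0x21..0x7E).
ASCII = {code: chr(code) for code in range(0x21, 0x7F)}

# Tiny illustrative slice of an 8-bit supplemental set, keyed by the
# 7-bit position (byte & 0x7F). Not a complete table.
SUPPLEMENTAL_SAMPLE = {0x23: "£", 0x69: "é"}


class CodeMatrix:
    """Toy model: four G slots, two of which are mapped into GL and GR."""

    def __init__(self):
        self.slots = [ASCII, ASCII, ASCII, ASCII]  # G0..G3
        self.active = {"GL": 0, "GR": 2}           # slot index per half

    def designate(self, slot, charset):
        """Step 1: place a character set into G0..G3."""
        self.slots[slot] = charset

    def invoke(self, half, slot):
        """Step 2: map one of G0..G3 into GL or GR."""
        self.active[half] = slot

    def decode(self, byte):
        """Return the graphic character for a byte, or None."""
        if 0x21 <= byte <= 0x7E:
            half = "GL"
        elif 0xA1 <= byte <= 0xFE:
            half = "GR"
        else:
            return None  # controls, SP, DEL, 0xA0 and 0xFF are not modelled
        return self.slots[self.active[half]].get(byte & 0x7F)


if __name__ == "__main__":
    matrix = CodeMatrix()
    matrix.designate(2, SUPPLEMENTAL_SAMPLE)
    matrix.invoke("GR", 2)
    print(matrix.decode(0x41), matrix.decode(0xE9))
```

Mapping the same slot into GL instead (`matrix.invoke("GL", 2)`) makes the 7-bit code 0x69 decode through the sample table, which is the "DEC-MCS in GL" case mentioned for terminals without GR.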

MCS or NRCS?

What is the difference between the MCS described here and the national replacement character sets described in all the manuals? If it's the same thing, shouldn't we change the name? Maury Markowitz (talk) 21:35, 24 January 2015 (UTC)

The explanation is given in the lede: MCS is one of several encodings provided by the NRCS feature. TEDickey (talk) 21:41, 24 January 2015 (UTC)
Actually, you're both wrong. DEC-MCS was a 'replacement' for the ASCII NRCS variants. It purposely includes all the additional characters that are in the Western European variations of ASCII. Each of those is a seven bit set that exchanges less important characters in US-ASCII with essential characters for the language in question. 2001:470:1F09:10D6:5E26:AFF:FE7D:2F72 (talk) 16:46, 7 November 2015 (UTC)

code pages vs sources

That bare statement in the lede asserts that cp1287 and cp1288 are cited in the VT220 reference manual as implementations of the character set. They are not. The mention of equivalence for the character set in a terminal emulator is interesting, but not authoritative (it's just another made-up "fact"). The statement, if kept, should be moved out of the lede and revised. TEDickey (talk) 09:29, 13 February 2017 (UTC)

0xDD

ECMA source in article says Ý. --Redeemer (talk) 20:14, 11 September 2019 (UTC)

ECMA is not a suitable source for this character set. The VT220 reference manual (EK-VT220-RM-001), table 2-3, shows U+0178 (matching this topic). TEDickey (talk) 22:17, 11 September 2019 (UTC)
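To make the two readings of 0xDD concrete, a short Python sketch: the Latin-1 value comes from Python's built-in `latin-1` codec, while the MCS value is simply the U+0178 reported above from table 2-3, hard-coded here because Python ships no DEC MCS codec. The helper name `describe` is made up for this example.

```python
import unicodedata


def describe(char):
    """Format a character as 'U+XXXX NAME'."""
    return f"U+{ord(char):04X} {unicodedata.name(char)}"


# 0xDD read as ISO 8859-1 (Latin-1), which agrees with the Ý reading.
latin1_dd = bytes([0xDD]).decode("latin-1")

# 0xDD read as DEC MCS, per the table 2-3 reading cited above (assumption:
# value taken from this discussion, not computed by any codec).
mcs_dd = "\u0178"

if __name__ == "__main__":
    print("Latin-1:", describe(latin1_dd))
    print("DEC MCS:", describe(mcs_dd))
```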

0xA0

The interpretation of 0xA0 as NBSP was introduced by people long after the DEC MCS was defined (read the sources). TEDickey (talk) 00:29, 15 April 2020 (UTC)

I've switched 0xA0 back to undef. DRMcCreedy (talk) 01:19, 15 April 2020 (UTC)