Talk:Extended ASCII/Archive 1

This is an archive of past discussions. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.

How are They Stored?

If you have a string of characters, how does one differentiate between two different code pages? I'd imagine that some form of escape character would be used. For example, if I'm writing using ISO-8859-1 and want to switch over to Greek, what changes internally on non-Unicode terminals? 66.190.72.225 07:23, 26 November 2005 (UTC)

In general, the only way to know the encoding of a sequence of bytes conclusively is through out-of-band meta-information. For example, MIME and HTTP use the Content-Type header to specify the character set and encoding.
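To illustrate the point about meta-information carried outside the bytes, here is a minimal Python sketch. It assumes the charset arrives as a Content-Type header value and reads the charset parameter with the standard library's email.message.Message; the helper name decode_with_declared_charset and the three sample bytes are made up for illustration.

```python
from email.message import Message


def decode_with_declared_charset(content_type: str, body: bytes) -> str:
    """Decode body using the charset parameter of a Content-Type value."""
    msg = Message()
    msg["Content-Type"] = content_type
    charset = msg.get_content_charset()  # lowercased charset name, or None
    if charset is None:
        raise ValueError("Content-Type carries no charset parameter")
    return body.decode(charset)


# The same three bytes, with nothing in them that says which code page applies.
body = bytes([0xE1, 0xE2, 0xE3])

as_greek = decode_with_declared_charset("text/plain; charset=ISO-8859-7", body)
as_latin = decode_with_declared_charset("text/plain; charset=ISO-8859-1", body)

print(as_greek)
print(as_latin)
```

The bytes are identical in both calls; only the header changes the result, which is why the encoding cannot be recovered from the data alone.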

For your second question, it sounds like you want to mix characters from different character sets. That is difficult or impossible to do without Unicode. There are some encoding formats, such as ISO 2022, that permit mixing encodings, but Unicode is generally the best choice. 216.113.168.128 20:04, 21 December 2005 (UTC)
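As a rough illustration of the escape-based switching asked about above, the following Python sketch uses the built-in iso2022_jp codec, one member of the ISO 2022 family. The choice of that codec and of a Japanese sample character is an assumption made only because the codec ships with Python; it is not a Greek example.

```python
ESC = 0x1B  # the ASCII escape control character


def encode_iso2022_jp(text: str) -> bytes:
    """Encode text with Python's built-in ISO-2022-JP codec."""
    return text.encode("iso2022_jp")


# Latin letters on both sides of U+3042 (HIRAGANA LETTER A).
sample = "abc\u3042def"
encoded = encode_iso2022_jp(sample)

print(encoded)             # escape sequences bracket the non-ASCII character
print(encoded.count(ESC))  # number of escape bytes in the stream
```

The escape sequences are in-band: the byte stream itself announces each switch of character set, which is the mechanism the original question guessed at.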

Some programs, like Microsoft Internet Explorer, try to "guess" the code page based on how the text looks in each code page. 84.58.183.1 21:27, 25 April 2007 (UTC)

ANSI or ASCII?

What is the difference between ANSI and Extended ASCII?

Is there a difference, or are they just the flip sides of the same thing? 216.99.201.109 (talk) 20:38, 30 October 2009 (UTC)

ANSI's not a character set (it's used to refer to the ANSI escape sequences). Tedickey (talk) 21:44, 30 October 2009 (UTC)

‘ANSI’ as used in Microsoft Windows programming means any character encoding other than Unicode (the exact meaning depending on the codepage setting and particular language version of Windows in use). A number of APIs have both A (‘ANSI’) and W (wide, i.e. Unicode) versions of functions.

I guess this is an example of double metonymy: ‘ANSI’ being used not to refer to just one of the organization’s standards (as with the X3.64 escape sequences mentioned by Mr Dickey) but to the whole kind of character encoding they might standardize (i.e. byte-level).

(Note that although the original 7-bit ASCII was an ANSI standard (X3.4), it’s unlikely to be referred to as ANSI: the ASCII name is so much better known.)

So to answer your question, they’re both vague terms that can mean vaguely the same thing. --82.46.154.229 (talk) 18:59, 3 April 2011 (UTC)

If you're going to cite Windows, a Microsoft webpage would be appropriate. Googling on "windows ansi encoding" isn't showing me anything appropriate, merely the usual uninformed comments (such as Wikipedia). TEDickey (talk) 20:47, 3 April 2011 (UTC)

Indeed. See Code Pages (Windows) and Unicode and Windows XP [PDF], which additionally give the origin of the term from the ANSI draft that became ISO 8859-1. --82.46.154.229 (talk) 02:57, 5 April 2011 (UTC)

That addresses part of the comments above: this source equates "ANSI" with CP-1252, but doesn't generalize to the other code pages that are supported in Windows. TEDickey (talk) 10:00, 5 April 2011 (UTC)

FYI, Windows_ANSI_code_page#ANSI_code_page deals with this. — Preceding unsigned comment added by 86.75.160.141 (talk) 20:22, 29 October 2012 (UTC)

I see that it talks about it, but I also see that it adds comments which are not found in the given sources (it seems that some editors provided their own story). Using Wikipedia instead of a reliable source isn't conducive to a discussion. TEDickey (talk) 23:42, 29 October 2012 (UTC)

You are right that Wikipedia is not a correct reference, but the above page provides this reference, http://msdn.microsoft.com/en-us/goglobal/bb964658.aspx#a, where MSDN explains why ANSI is a misnomer. 86.75.160.141 (talk) 16:08, 1 November 2012 (UTC)

Variations and extensions

Hundreds or thousands of standards exist, with many variations and with parts shared from one to another, so it is difficult to get an overview of how they relate. The following table illustrates how ASCII, extended ASCII, and their variants have had a central influence on the evolution of technology.

Telegraphy	Telephony	Computing					Aviation

Original Baudot code (International alphabet n°1)
↓
International alphabet n°2 (IA n°2)
		Variants of EBCDIC and other character encodings
		⇟
		ISO 646 - IRV (international reference variant)					Arinc
		↓	↓	↓	↓	↓
		ISO 646 - US (United States)
						ISO 646: Other countries
						↓
						ISO 646: Other countries
		↓	↓	↓	↓	↓
		ASCII	Code page DOS (437, 850, ...)	ISO 8859 series (for example ISO 8859-1, ISO 8859-15)	ISO 2022 (supports more than 256 characters)	↓
			⇟			↓
			Windows code page such as Windows-1252 (or Ansinew)			↓
		↓	↓	↓	↓	↓
		ISO 10646 / Unicode
↓
IA n°5	GSM 03.38 (SMS)

Legend:

ASCII	ASCII standard or standards very close to ASCII
Extended ASCII	Adds characters to those of ASCII
Extended ASCII	Adds characters to those of ASCII, but ASCII bytes may represent other characters depending on context
ASCII variants	Mostly ASCII, but with some code points representing different characters
ASCII subset	Mostly ASCII, but with some code points reserved for national variants
Unrelated to ASCII	Not related to ASCII
⇟	New encoding; the previous set of characters is not preserved
⇣	New encoding that provides the set of characters already available in the previous one

⇣⇟⇓⇩⥕⥥⟱⤋⬇↡

The Windows code pages could be considered ISO-8859-x with additional characters, rather than a re-encoding of the IBM code pages. Spitzak (talk) 02:28, 6 August 2014 (UTC)
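One way to check the observation above for a single pair is to compare the decoding tables directly. This Python sketch uses the codec names cp1252, latin-1 and cp437 as shipped with Python; the helper differing_bytes is hypothetical, and Python's tables are assumed to be a fair stand-in for the code pages being discussed.

```python
def differing_bytes(enc_a: str, enc_b: str) -> list[int]:
    """Return the byte values that the two single-byte codecs decode differently."""
    diffs = []
    for value in range(256):
        raw = bytes([value])
        # errors="replace" turns unassigned bytes into U+FFFD instead of raising.
        if raw.decode(enc_a, errors="replace") != raw.decode(enc_b, errors="replace"):
            diffs.append(value)
    return diffs


vs_latin1 = differing_bytes("cp1252", "latin-1")
vs_cp437 = differing_bytes("cp1252", "cp437")

print([hex(v) for v in vs_latin1])
print(len(vs_cp437))
```

In this comparison the differences from ISO 8859-1 are confined to 0x80 through 0x9F, where ISO 8859-1 has C1 control codes, whereas the differences from code page 437 are spread across the upper half.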

No sources?

There are no sources given for "Extended ASCII": one source is a 404, and the other two sources literally say there is no such thing as "Extended ASCII". At the moment, it seems Wikipedia is the original source for this. 109.193.248.102 (talk) 23:45, 17 May 2022 (UTC)

If you're referring to the first three sources in the article, then 1) I updated the link to the Oracle forum posting to its current location, so no more 404s (and that one had an archive link that worked), and 2) all three of the comments are part of threads (mail, forum, or USENET) that speak of "extended ASCII" but all the comments say "don't use that term". This should not be surprising, given that they're used as a reference for the claim that "Using the term "extended ASCII" on its own is sometimes criticized...".

Given that you also proposed deleting the page, I see two questions here:

1) Should the term "extended ASCII" be used? On the one hand, the use of that term in the thread indicates that there are people who use it; on the other hand, the comments in the thread indicate that there are arguments against its use.

2) Does the concept of "character sets that encode characters as sequences of 8-bit bytes, and in which the characters in ASCII are encoded as a single 8-bit byte whose value is the code point for the character, and in which characters not in ASCII are encoded as sequences of one or more 8-bit bytes in which the first byte has the uppermost bit set" deserve a Wikipedia page?

I don't see that a "no" answer to the first question requires a "no" answer to the second question:

The first reference speaks of "8-bit extensions of ASCII", by which I suspect they mean "character sets that encode characters as sequences of 8-bit bytes, and in which the characters in ASCII are encoded as a single 8-bit byte whose value is the code point for the character, and in which characters not in ASCII are encoded as an 8-bit byte with the uppermost bit set", so ISO 8859/1 is an "8-bit extension of ASCII" but various Extended Unix Code (EUC) encodings, and UTF-8, aren't.
The second reference speaks of encodings of the sort I describe, as well as of UTF-16, which uses ASCII code points to represent ASCII characters, but doesn't encode them as single 8-bit bytes.
The third reference speaks of "many, many, many different character sets designed such that ASCII is a subset of them", saying that "These may logically be regarded as extensions to ASCII, but you can't point to any one of them and say "that's Extended ASCII"."

so they all acknowledge the existence of the concept of character sets that extend ASCII by adding new characters.

I think the general concept is useful, and its existence is acknowledged by the three people complaining about the term "extended ASCII", so I don't think the article should be deleted; instead, the page should remain, with a new title. Guy Harris (talk) 00:41, 18 May 2022 (UTC)
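The two definitions quoted in this comment can be turned into a rough test. The Python sketch below is only a spot check under stated assumptions: it probes three sample characters rather than a codec's whole repertoire, it ignores stateful encodings, and the function names are hypothetical.

```python
# A few non-ASCII probes (e-acute, n-tilde, u-diaeresis); a sample, not a proof.
SAMPLE_NON_ASCII = ["\u00e9", "\u00f1", "\u00fc"]


def ascii_is_identity(encoding: str) -> bool:
    """True if every ASCII character encodes to the single byte of its code point."""
    return all(chr(cp).encode(encoding) == bytes([cp]) for cp in range(128))


def encode_samples(encoding: str) -> list[bytes]:
    """Encode the probe characters, skipping any the encoding cannot represent."""
    encoded = []
    for ch in SAMPLE_NON_ASCII:
        try:
            encoded.append(ch.encode(encoding))
        except UnicodeEncodeError:
            continue
    return encoded


def is_extended_ascii(encoding: str) -> bool:
    """Broad definition: ASCII unchanged, non-ASCII starts with a high-bit byte."""
    return ascii_is_identity(encoding) and all(
        e[0] & 0x80 for e in encode_samples(encoding)
    )


def is_single_byte_extension(encoding: str) -> bool:
    """Narrow definition: as above, and each non-ASCII probe is exactly one byte."""
    return is_extended_ascii(encoding) and all(
        len(e) == 1 for e in encode_samples(encoding)
    )


for name in ("latin-1", "utf-8", "utf-16-le"):
    print(name, is_extended_ascii(name), is_single_byte_extension(name))
```

On these probes ISO 8859-1 meets both definitions, UTF-8 meets only the broad one, and UTF-16 meets neither, which matches the distinctions drawn between the three references.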

Charset table as ASCII

I removed the following section. See below for rationale. --Pjacobi 10:51, September 5, 2005 (UTC)

Extended ASCII Table

File:Exascii.jpg

Extended ASCII uses 8 digits of 1's and 0's for a total of 256 characters. The first 32 however cannot be shown as they are special control sequences and thus they cannot be printed. Also the two blanks are space characters.

IMHO this section is a bad idea for two reasons:

Giving one specific charset, whereas the article correctly states that there are plenty.
Using graphics for a text table

10:51, September 5, 2005 (UTC)

I agree with the second reason; it's just that some of the characters can't be displayed on the web, as far as I know. If anyone knows a way, a regular table would be nicer. On the first point, however, ASCII has a table, and I believe a table could be beneficial and a major point of interest; it can teach you a fair amount about computers. Or maybe a text table of 00100001 through 01111110 (! through ~ on the chart), as they are the most commonly used? Or just 0-9 and A-Z? --Shimonnyman 11:28, 5 September 2005 (UTC)
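The text table suggested above, covering 00100001 through 01111110, could be generated rather than drawn. A minimal Python sketch follows; the function name printable_ascii_rows is hypothetical.

```python
def printable_ascii_rows() -> list[tuple[str, str]]:
    """Pair each code from 0x21 ('!') to 0x7E ('~') with its 8-bit binary form."""
    return [(format(code, "08b"), chr(code)) for code in range(0x21, 0x7F)]


rows = printable_ascii_rows()

# Show the first and last few rows of the table.
for bits, char in rows[:3] + rows[-3:]:
    print(bits, char)
```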

There is an ASCII table in ASCII.

Extended ASCII tables are in ISO 8859-1, ISO 8859-2, ISO 8859-3, ISO 8859-4, ISO 8859-5, ISO 8859-6, ISO 8859-7, ISO 8859-8, ISO 8859-9, ISO 8859-10, ISO 8859-11, ISO 8859-13, ISO 8859-14, ISO 8859-15, ISO 8859-16, Code page 437, Code page 850, Code page 858, KOI8-R, KOI8-U, TSCII, Mac-Roman encoding, Kamenicky encoding (list is incomplete).

Pjacobi 13:19, September 5, 2005 (UTC)

Those aren't Extended ASCII binary tables, which is what I was referring to and thought could help. However, the ASCII table on the ASCII page appears to be extended ASCII (a bit incomplete). I didn't check every code, but every one I checked was extended ASCII, and they aren't the same in both because, obviously, 7-bit and 8-bit aren't going to look identical. So maybe it belongs here. I don't know; I'm just observing. — Preceding unsigned comment added by Shimonnyman (talk • contribs) 23:52, 5 September 2005 (UTC)

"A table" of "extended ASCII" would be a table showing ASCII plus a bunch of "available for use when encoding other characters" slots; it cannot show any characters other than those in ASCII, because different extensions of ASCII have different characters, and thus would have different tables. That's exactly what User:Pjacobi said. Guy Harris (talk) 00:57, 18 May 2022 (UTC)
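The "ASCII plus open slots" table described above can be approximated by keeping only the byte values that several extensions decode identically. This Python sketch picks five codecs from the list earlier in this thread (ISO 8859-1, ISO 8859-2, code page 437, KOI8-R, Mac-Roman) as an arbitrary sample; the function names and the middle-dot placeholder are illustrative choices.

```python
ENCODINGS = ["latin-1", "iso8859_2", "cp437", "koi8_r", "mac_roman"]


def common_slots(encodings: list[str]) -> dict[int, str]:
    """Map each byte value to its character when all the encodings agree on it."""
    common = {}
    for value in range(256):
        decoded = {bytes([value]).decode(enc, errors="replace") for enc in encodings}
        if len(decoded) == 1:
            common[value] = decoded.pop()
    return common


def render(common: dict[int, str]) -> str:
    """Draw a 16x16 grid; a middle dot marks control codes and disputed slots."""
    rows = []
    for high in range(16):
        cells = []
        for low in range(16):
            ch = common.get(high * 16 + low)
            cells.append(ch if ch is not None and ch.isprintable() else "\u00b7")
        rows.append("".join(cells))
    return "\n".join(rows)


common = common_slots(ENCODINGS)
print(render(common))
```

The lower half of the grid shows the ASCII characters; the upper-half slots on which the five codecs disagree appear as placeholders, since no single character can be printed for them.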

Old Layout this Article Only

It appears that this article (Extended ASCII) has the old Wikipedia layout and not the new one launched in January of 2023.

Is there a way to fix it? Ducktapeonmydesk (talk) 21:37, 26 January 2023 (UTC)

There might have been a cached copy on some Wikimedia server, and your two edits might have caused the cached copy to be flushed. There's also a "Purge" menu item that will purge cached copies of the article; it's under "Tools" in the new skin, and in whatever drop-down list contained "Move" in the old skin. Guy Harris (talk) 22:11, 26 January 2023 (UTC)