Talk:Code point

Code point <-> character

My edit was just reverted. I'm going to put it back but change "the common case" to "many code points".

The introduction of the article needs a simple explanation of the concept that is understandable without background knowledge, and giving "character" as an example provides a good intuition. -- Liiiii (talk) 14:04, 20 April 2014 (UTC)

jha 2A02:1810:4812:FA00:B533:DC33:D95F:418F (talk) 12:45, 28 March 2023 (UTC)

ASCII example

The ASCII example is unsuitable since ASCII does not use the code point/encoding distinction. It should be replaced by something else. — Preceding unsigned comment added by Liiiii (talk • contribs) 08:46, 21 April 2014 (UTC)

True, it doesn't use the code point/encoding distinction, but that isn't critical to the description of the concept of "code points". Codepoints are especially handy for systems like Unicode that have multiple encodings, but codepoints do exist in systems that only use one encoding.

(Informally, there are in fact multiple ASCII encodings: there's 6-bit ASCII, 7-bit ASCII with even, odd, and no parity, extended 8-bit ASCII, etc. Even UTF-8 can be considered to be an extended form of ASCII.) Dave92F1 (talk) 00:15, 27 August 2023 (UTC)
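
To make the code point/encoding distinction in this thread concrete, here is a minimal Python sketch. The helper `with_even_parity` is a hypothetical name introduced for illustration, and placing the parity bit in the top bit of an 8-bit unit is an assumption about one common convention, not something the comment above specifies.

```python
def with_even_parity(code_point: int) -> int:
    """Return an 8-bit unit: the 7-bit code point plus an even-parity bit on top.

    Hypothetical helper; assumes the parity bit occupies bit 7.
    """
    if not 0 <= code_point < 0x80:
        raise ValueError("not a 7-bit ASCII code point")
    ones = bin(code_point).count("1")
    return code_point | ((ones % 2) << 7)


code_point = ord("A")  # the code point itself: 0x41

# One code point, several byte-level encodings.
encodings = {
    "7-bit, no parity": bytes([code_point]),
    "7-bit, even parity": bytes([with_even_parity(code_point)]),
    "UTF-8": "A".encode("utf-8"),
    "UTF-16 (big-endian)": "A".encode("utf-16-be"),
}

for name, data in encodings.items():
    print(f"{name:22} {data.hex()}")

# A code point with an odd number of 1 bits shows the parity bit being set.
print(hex(with_even_parity(ord("C"))))
```

The code point stays 0x41 throughout; only the bytes that carry it change, which seems to be the sense in which ASCII can be said to have more than one encoding.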

Codepoints are not just about Unicode or characters

The concept of codepoints (or code points) long predates Unicode (and ASCII) and is much more general than characters. This article makes it sound like it's a concept from Unicode.

In fact a codepoint is a unique point in a quantized n-dimensional space, which is assigned a semantic meaning.

In other words, it's a table entry that has been filled in with something. The table has discrete positions (1, 2, 3, 4, but not fractions) and may be one dimensional (a column), two dimensional (like cells in a spreadsheet), three dimensional (sheets in a workbook), etc... in any number of dimensions.

It is used to convey meanings. ASCII uses a simple 1D codepoint table: 0x20 is a space, 0x41 is 'A', etc.

But the concept is used in a multitude of formal information processing standards, in which codepoints are assigned meanings. For example, ITU-T Recommendation T.35 (https://www.itu.int/rec/T-REC-T.35-200002-I/en) is a set of country codes for fax machines (and other things). Each country is assigned a codepoint, so that when a fax machine calls, it can identify which country it's calling from. Argentina is 0x03, Canada is 0x20, Gambia is 0x41, etc.

These are codepoints. They have nothing to do with character encoding. — Preceding unsigned comment added by Dave92F1 (talk • contribs) 02:42, 20 August 2018 (UTC)
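
The "table entry" view described above can be sketched in a few lines of Python. This is only an illustration: the table contents are the values quoted in this thread and have not been checked against the standards, and the names `ASCII`, `T35_COUNTRY`, `GRID`, and `meaning` are hypothetical.

```python
# Two independent code point tables. The values are the ones quoted in this
# thread; they have not been checked against the standards themselves.
ASCII = {0x20: "SPACE", 0x41: "LATIN CAPITAL LETTER A"}
T35_COUNTRY = {0x20: "Canada", 0x41: "Gambia"}

# A two-dimensional table works the same way: the key is a (row, column)
# position. This table and its entries are hypothetical.
GRID = {(0, 0): "start", (0, 1): "stop"}


def meaning(table: dict, code_point) -> str:
    """Look up what a code point means within one particular table."""
    return table.get(code_point, "<unassigned>")


for cp in (0x20, 0x41):
    print(f"0x{cp:02X}: ASCII={meaning(ASCII, cp)!r}, T.35={meaning(T35_COUNTRY, cp)!r}")

print(meaning(GRID, (0, 1)), meaning(GRID, (1, 1)))
```

The same number, 0x41, is a letter in one table and a country in the other, so the meaning comes from the table rather than from the number.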

Agree. We should distinguish the use of the term code point for both character and object associations. Do we have proper sources covering this difference, btw? AXONOV (talk) ⚑ 10:54, 21 January 2022 (UTC)

Codepoint vs abstract characters

This page is quite confusing and claims the Unicode standard does not explain the difference between abstract characters and codepoints, yet it does so much more clearly than this article, which confusingly brings up codepages when discussing Unicode:

The distinction between a code point and the corresponding abstract character is not pronounced in Unicode, but is evident for many other encoding schemes, where numerous code pages may exist for a single code space.

The statement above is especially confusing as code pages both define a code space and a mapping for code points to abstract characters.

The clear explanation can be found in the Unicode Standard 8.0.0, section 2.4, "Code Points and Characters": http://www.unicode.org/versions/Unicode8.0.0/UnicodeStandard-8.0.pdf#G7.25564 78.124.136.51 (talk) 06:56, 26 January 2023 (UTC)
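
For anyone who wants to see the difference in practice, here is a small Python sketch using only the standard library. The choice of byte 0xE9 and of the three code pages is arbitrary, made for illustration; it is not taken from the article or from the Unicode Standard.

```python
import unicodedata

byte = bytes([0xE9])  # one position in an 8-bit code space

# Each code page supplies its own mapping from that position to a character.
for codec in ("cp1252", "cp437", "cp1251"):
    ch = byte.decode(codec)
    print(f"{codec}: 0xE9 -> U+{ord(ch):04X} {unicodedata.name(ch)}")

# In Unicode, the code point U+00E9 always names the same abstract character...
print(unicodedata.name("\u00e9"))

# ...but the relation is still not one-to-one: the same abstract character can
# also be written as a sequence of two code points (base letter + combining mark).
decomposed = "e\u0301"
composed = unicodedata.normalize("NFC", decomposed)

# And some code points are not assigned to any abstract character at all.
surrogate_category = unicodedata.category("\ud800")      # a surrogate code point
noncharacter_category = unicodedata.category("\uffff")   # a noncharacter
print(len(decomposed), len(composed), surrogate_category, noncharacter_category)
```

With code pages the position-to-character mapping changes per page, while in Unicode it is fixed, yet code points and abstract characters still do not line up one to one.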