Talk:Algorithmic information theory

From Wikipedia, the free encyclopedia

Issues

The following remarks pertain to the article Algorithmic Information Theory. The article as it stands has some issues that I would like to bring to discussion. I hope that some consensus can be reached about how to deal with them.

  • There are no references.
  • Kolmogorov complexity is an active research area in mathematics as well as computer science.
  • The statement
Unlike classical information theory, algorithmic information theory gives formal,
rigorous definitions of a random string.....

is a point of view that is not universally shared, although it has been championed by Chaitin in popularizations of the area. It is well known that the AIT definition of random string is actually only asymptotically well defined, so the viewpoint that AIT does not give a formal definition of a random string is equally valid (and, among mathematicians I have spoken with, a common viewpoint). The issue is the lack of a canonical choice of a universal machine.

  • The definition of random infinite sequence has the same extent as the one given by Martin-Löf, and this should be mentioned.

CMummert 04:37, 16 June 2006 (UTC)

This is a stub. If you want to add more content or more references, feel free. However, it is a fact that classical information theory gives no definition of a random string or random infinite sequence (nor does it make any attempt, focusing as it does on random processes), whereas AIT does. Maybe there are those who don't like the definitions for aesthetic reasons, but they exist nonetheless, and have numerous applications in theoretical computer science, e.g.
  • the result of Allender et al. (Power from Random Strings) that PSPACE ⊆ P^R, where R is the set of Kolmogorov random strings
  • the incompressibility method used to prove lower bounds such as the average-case complexity of Heapsort, as described in Li and Vitanyi (An Introduction to Kolmogorov Complexity and its Applications)
  • many, many others
The definition of a random string is formal, so long as one accepts that Turing machines are formal mathematical objects, and it is rigorous and robust, as the above-mentioned results, and many others, do not depend on the choice of universal machine. If you are aware of results in pure mathematics that demonstrate that the definition of a random string or sequence is frail, please add these results. The notion that these definitions are robust to the choice of universal machine predates Chaitin. This robustness is due to the invariance theorem (Lemma 2.1.1 in Li and Vitanyi), first proven by Solomonoff in 1964. Finally, I'm afraid I don't understand your final point:
The definition of random infinite sequence has the same extent as the one given by Martin-Löf,
and this should be mentioned.
Is there another definition you are thinking of? (e.g. Schnorr randomness?) Pexatus 22:16, 17 June 2006 (UTC)
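For reference, the invariance theorem cited above can be written out: if U and V are any two universal machines, there is a constant c_{U,V}, depending only on U and V (not on the string), such that

```latex
K_U(x) \;\le\; K_V(x) + c_{U,V} \qquad \text{for every string } x.
```

So any two universal machines assign complexities that agree up to an additive constant, though the constant may be large; that gap is exactly what the discussion above turns on.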
Yes, the definition of a Kolmogorov random string is rigorous, but only once you have fixed some universal Turing machine (or universal prefix-free machine). There is no definition which in any canonical manner declares that particular finite strings are random while others are not. By noncanonical I mean that any fixed string is random with respect to some universal machines and nonrandom with respect to others (by Kraft's theorem, any string can be given any Kolmogorov complexity). For particular results that use the existence of random strings, such as the incompressibility method, the lack of canonicity is not important. But it is important that the article doesn't claim that there is a way to canonically say that particular strings are random and others are not. That is, the theory does not lead to an unambiguous answer to the question 'Is the string 10101010111 random?'
So the quote
Unlike classical information theory, algorithmic information theory gives formal,
rigorous definitions of a random string.....
should say
Unlike classical information theory, algorithmic information theory gives a formal,
rigorous definition of a random string, although this definition requires a noncanonical 
choice of a universal Turing machine.
or something like that. I would like to come to a consensus on the talk page about the best way to phrase it before editing the article.
Kolmogorov complexity does give a canonical definition of a random infinite sequence, because any two universal machines give the same asymptotic complexities up to a constant. This definition of random infinite sequence gives exactly the same collection of random infinite sequences as the definition of a random infinite sequence due to Martin-Löf, which says that an infinite sequence is random if it does not lie in any effective measure 0 set. The fact that these two definitions agree is usually presented as part of the motivation for these definitions. CMummert 02:14, 18 June 2006 (UTC)
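The agreement mentioned here is Schnorr's theorem: writing K for prefix-free Kolmogorov complexity and X↾n for the first n bits of an infinite sequence X,

```latex
X \text{ is Martin-L\"of random} \iff \exists c\,\forall n\; K(X{\upharpoonright}n) \ge n - c.
```

The additive constant absorbs the choice of universal prefix machine, which is why the infinite-sequence notion is machine-independent even though the finite-string notion is not.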
The way this is usually phrased in my research community (theoretical computer science) is that the Kolmogorov complexity of a string is well-defined up to a constant depending on the choice of universal machine. Maybe the sentence could be
Unlike classical information theory, algorithmic information theory gives a formal, 
rigorous definition of a random string. The set of random strings depends on the choice
of the universal Turing machine used to define Kolmogorov complexity, but any choice
gives identical asymptotic results, since the Kolmogorov complexity of a string is
invariant up to an additive constant depending only on the choice of universal Turing 
machine.
All the results I know of are asymptotic, even when dealing with finite strings, since we are making claims about infinite sets of strings, OR they are statements about individual strings that begin with "There is a constant c such that...". For instance, in the example above from Allender et al, the set of random strings changes with the universal machine, but any of these sets has sufficient power to decide all of PSPACE in polynomial time. Therefore, even though there are an infinite number of ways to define the set of Kolmogorov random strings, every one of these definitions results in the exact same theory of algorithmic information, including finite string results. And there are those who believe that there is in fact a canonical universal machine, the one which requires the fewest bits to encode, and that this number is a fundamental constant of the universe every bit as important as the charge on an electron or Avogadro's Number.
The multiple equivalent definitions of random infinite sequences are discussed on the page for algorithmically random sequences. If you are going to edit this article, look at algorithmically random sequence and Kolmogorov complexity first, so as not to introduce too much overlap between the articles. Pexatus 02:52, 18 June 2006 (UTC)

A second quibble

The sentence that says Omega is

a real number which expresses the probability that 
a random computer program will eventually halt. 

is not literally correct. There is no nontrivial measure on the set of programs, and thus there is no formal sense to the phrase 'random computer program'. What is true is that Omega gives the probability that a random infinite binary sequence has some initial string that is the index of a program that halts.

To see what is wrong with the phrase 'random computer program', consider the supposed experiment in which a coin is flipped until the index of a valid program has been obtained, at which point the flipping stops. What if no such program is ever obtained? It can be shown that this is possible for any prefix-free universal machine. Thus the coin-flipping procedure is not a valid probability experiment. If you were to compute the conditional probability only relative to those infinite sequences that begin with the index of a program, then the conditional probability you get would not, in general, be equal to Omega. It would be Omega divided by the measure of the set of infinite sequences that start with the index of a program. CMummert 02:20, 18 June 2006 (UTC)

The stuff about Chaitin seems to be hastily written. I'm sure this could be cleaned up (and references to other researchers besides Chaitin added). A better way to phrase that would be
Omega is a real number which expresses the probability that a self-delimiting
universal Turing machine will halt when its input is supplied by flips
of a fair coin.
The event space is the set of all infinite sequences that could be pre-placed on the input tape, and the event "U halts while reading the input tape supplied" is the event with probability Omega.
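In symbols, with U a universal prefix-free (self-delimiting) machine,

```latex
\Omega_U \;=\; \sum_{p \,:\, U(p)\downarrow} 2^{-|p|},
```

where the sum ranges over all programs p on which U halts. Because the domain of U is prefix-free, the Kraft inequality guarantees that the sum converges and is at most 1, so the probability reading is legitimate.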

OK?

Here are slightly edited versions extending Pexatus's proposed sentences. Are they OK with everyone?

Unlike classical information theory, algorithmic information theory gives a formal, 
rigorous definition of a random string. (The set of random strings depends on the choice
of the universal Turing machine used to define Kolmogorov complexity, but any choice
gives identical asymptotic results because the Kolmogorov complexity of a string is
invariant up to an additive constant depending only on the choice of universal Turing  
machine.)
Ω is a real number which expresses the probability that a self-delimiting
universal Turing machine will halt when its input is supplied by flips
of a fair coin.  Although Ω is easily defined, it is impossible to determine 
infinitely many of its binary digits in any consistent axiom system; this 
result is similar to Gödel's Incompleteness Theorem in establishing the
unprovability of certain statements in formal theories.  Although the digits of Ω 
cannot be determined, many properties of Ω are known; for example, it is an 
algorithmically random sequence and thus its binary digits are equidistributed.
From the point of view of computability theory, Omega is of the same Turing degree
as the halting problem.

Pexatus: thanks for pointing out algorithmically random sequence. I would never have found it with that name. They are called 1-random reals in computability theory. CMummert 12:05, 20 June 2006 (UTC)


Looks good. I personally think that it is easier to see that Ω is uncomputable because access to it allows one to solve the halting problem, but since I'm not a logic guy, I guess I have a different notion of what is easy to see. I would remove the part about its digits being equidistributed (i.e., it is simply normal), as that might give the false impression that that is the definition of a random sequence. At some point, self-delimiting Turing machine should be given its own entry and linked from here. (a Turing machine with a read-only input tape whose read head may only move to the right, and which is defined such that its computation is invalid if the read head is not on the final character of the input sequence when the machine halts). Pexatus 18:35, 20 June 2006 (UTC)
Thanks for the feedback. I'll cut the equidistributed part (the whole clause starting with "and thus"). The Turing degree is the easiest way for me to see that Omega is uncomputable, as well, although I don't know any really easy proof. CMummert 22:49, 20 June 2006 (UTC)
I edited the page using these suggestions as guidelines, since it's been two years and no action had been taken. - skeptical scientist (talk) 16:29, 9 July 2008 (UTC)

"Information" misleading, means compressibility?

The article now states "This use of the term "information" might be a bit misleading, as it depends upon the concept of compressibility."

Is this fair? I'm not sure if it does depend on compressibility. In the case of infinite strings, there are equivalent definitions using betting strategies and constructive null sets which make no mention of compressibility, and even in the finite case there is an equivalent definition using termgales. Furthermore, I think there are good reasons to use compressibility and information interchangeably, as different representations of the same piece of data should have the same information content, and no piece of data can contain more information than its most compact representation. The use of the term "information" may be misleading, but I think this has more to do with the distinction between "information" and "useful information" which is described elsewhere. skeptical scientist (talk) 14:38, 21 July 2008 (UTC)

Utility is an economic concept

IMHO, useful information refers to an economic concept, not a computer-science one. First, there is a cost associated with processing information, and there is a trade-off to make among space, time, and energy. Kolmogorov complexity is the minimum size, given that you are willing to spend an arbitrary amount of time computing; it is an extremum of a space/time function. A random sequence can be seen as a fully (losslessly) compressed one; therefore it costs time and energy to uncompress it before using it somehow. The operation may be far from the optimal cost, and so useless. Second, the information could have no utility because it cannot be processed, just as a text written in Chinese is useless to me alone. All of these utility considerations are relative to a context unrelated to the theory. 77.192.56.161 (talk) 10:33, 19 October 2010 (UTC) a student

The last section, in particular the last paragraph, needs references: "AC, AP, and AR are the core sub-disciplines of AIT, but AIT spawns into many other areas. It serves as the foundation of the Minimum Description Length (MDL) principle, can simplify proofs in computational complexity theory, has been used to define a universal similarity metric between objects, solves the Maxwell daemon problem, and many others." — Preceding unsigned comment added by 2A02:2788:925:E21B:91A9:A15:BAF:C904 (talk) 23:37, 17 February 2021 (UTC)

The Scholarpedia article on this topic was reviewed by Paul Vitanyi, which is sufficient to make it a notable and (at least when Vitanyi reviewed it) reliable resource. Thus, I have restored it as an external link.  Kiefer.Wolfowitz  (talk) 12:37, 10 February 2011 (UTC)

There appear to be at least two outstanding errors in this article http://en.wikipedia.org/wiki/Algorithmic_information_theory

First,

'From this point of view, a 3000 page encyclopedia actually contains less information than 3000 pages of completely random letters'

is not completely true. In point of fact, one can plot the frequencies of letters, or of the sounds those letters make up, to determine whether a text represents actual language rather than pure gibberish. Therefore, written language cannot be said to contain random information in the way a string of random letters does. Even if you jumbled the letters of the encyclopedia and then plotted their counts, specific letters would occur more often than others because of their use in the language, so they are not random.
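The disagreement here is really about compressibility: the article's claim is that English text, precisely because its letter frequencies are not uniform, admits a shorter description than uniformly random letters do, and in the algorithmic sense a shorter description means less information. A rough empirical sketch, using zlib as a crude stand-in for an ideal compressor (which only upper-bounds the true Kolmogorov complexity):

```python
import random
import string
import zlib

random.seed(0)  # fixed seed so the experiment is reproducible

# Structured, English-like text with realistic letter statistics
english_like = ("the quick brown fox jumps over the lazy dog " * 70)[:3000]
# Uniformly random lowercase letters of the same length
random_letters = "".join(random.choice(string.ascii_lowercase) for _ in range(3000))

c_eng = len(zlib.compress(english_like.encode()))
c_rand = len(zlib.compress(random_letters.encode()))

# The structured text compresses to far fewer bytes, i.e. it has a
# much shorter description than the random letters of equal length.
print(c_eng, c_rand)
assert c_eng < c_rand
```

Of course zlib only detects one kind of regularity; the point is just that "less information" in the AIT sense means "admits a shorter program", which the non-uniform statistics of real language make possible.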

Secondly,

'1100100001100001110111101110110011111010010000100101011110010110 presumably has no simple description other than writing down the string itself'

The above statement is also not entirely true. A program or algorithm can convert the binary number to octal, hexadecimal, base 10, or any other base numerical system, including expressing the above string of 1s and 0s in base-2^n groupings, for example. Such an alternate representation can also be an algorithmic expression of the binary string. — Preceding unsigned comment added 14:14, 2 August 2013 (UTC)
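For what it's worth, a change of base re-encodes the string but does not shorten its description in the bit-counting sense that Kolmogorov complexity uses: each hexadecimal digit carries 4 bits, so the 64-bit string is merely rewritten, not compressed. A quick illustration in Python:

```python
s = "1100100001100001110111101110110011111010010000100101011110010110"

n = int(s, 2)  # interpret the bit string as an integer
hex_repr = format(n, "0%dx" % (len(s) // 4))  # same value in hexadecimal

# Fewer characters, but each hex digit encodes 4 bits: the description
# length in bits is unchanged, so no compression has taken place.
assert int(hex_repr, 16) == n
assert len(hex_repr) * 4 == len(s)
print(hex_repr, len(hex_repr), len(s))
```

Since Kolmogorov complexity counts the bits of the shortest program producing the string, any fixed-width base conversion changes the complexity by at most a constant (the size of the converter), never asymptotically.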