josephzizys + all_posts 4
The Meaning of Omega
december 2011 by josephzizys
And the meaning of the recently-broken metric TSP approximation ratio
Andrew Stothers wrote in 2007 a first-year qualifying report for the University of Edinburgh PhD programme. In only eleven pages he presented the parts of the famous paper by Don Coppersmith and Shmuel Winograd (CW) that do enough to obtain for the exponent of matrix multiplication. In a few more pages he surveyed recent work on sufficient conditions for . Page 15, the last except for references, laid out “Possible Directions for Future Work” in five sentences, the fifth being:
Finally, it will also be possible to investigate bilinear algorithms further and use more complex means to determine whether or not “better” algorithms exist for matrix multiplication.
Today we wish to explain what means, why lower is always “better,” why we feel our field is better for the great efforts expended by Stothers and Virginia Vassilevska Williams, and why it is better when intellectual values guide our judgment. We also wish to defend ourselves and others for calling the results “breakthroughs,” comparing and contrasting to recent advances on algorithms for the Traveling Salesman Problem.
I (Ken) wrote a similar first-year qualifier in summer 1982 after my first year at Oxford—they are a staple of leading British universities. Like Andrew’s it was not obligated to have original research, just to demonstrate understanding and capability. One item found its way into a conference paper, but my 1986 dissertation ranged into various other topics.
What distinguishes Andrew’s Nov. 2010 dissertation is its singular focus on accomplishing the objective of his proposal. And doing it. The resulting paper with his advisor Sandy Davie has recently been submitted. Meanwhile in North America, Virginia had been thinking about the same problem since 2005. It may not be true as asserted here that “hundreds of people tried to improve” it after 1987, but we’ll guess about as many people thought about it. It’s enough to note that many researchers felt it worthy of pursuit, including Andrew’s advisor, and that progress took large concerted effort. The nature of this effort—including the use of computers—may be an important harbinger for science by itself.
What Does ω Mean?
The definition of is the infimum of all such that matrix multiplication has an algorithm that runs in time . This is in a classical model of computation where addition and multiplication of scalars have unit cost. Note that there might not be an algorithm that runs in time itself— may properly be a limit even if it turns out to have the least possible value 2. Currently all we know is that from Virginia’s work.
As many have noted, does not have immediate practical relevance because there is evidently a tradeoff with the constant hiding under the in . The tradeoff is so far steep enough that apparently only Volker Strassen’s relatively simple algorithm achieving an exponent of has wide use. So why do we care about values of less than that?
One of the central questions in physics is, why do the fundamental constants of nature have the values they do? The desire for why, in the face of evidence that the values could be arbitrary, has aroused such passion that string theorist David Gross channeled Winston Churchill in exhorting young physicists to “never, never, never give up.”
Now in mathematics it may seem nonsensical to ask, why do numbers like and have the values they do? But with , we are in a sense closer to physics: understanding it is akin to discovering a natural law, at least a law of information. And quantum mechanics has if anything magnified William Hamilton’s vision approaching 200 years ago that Nature computes with matrices, yes large ones. It is possible that detracts from a better statement of important laws, but knowing better brings us closer to them all the same.
Barriers and Breakthroughs
Hence also the interest in whether takes a value with known meaning, such as or (both falsified) or , or as some have mused, or , or as some have argued more likely for exponents, a logarithm of some higher and simpler number. Strassen himself has often stated his belief that is strictly greater than . If it has a value not previously seen in mathematics, then we could hope to discover new mathematical regularities as well; if it has a known value, then this may yield some more explanation about algorithms.
The value seemed a natural possible barrier, but the time interval from to in 1981–82 was very short. By contrast the “CW” bound of was not a natural-seeming number, but it withstood attempts to scale it for 23 years, including a year’s work from each of Virginia and Andrew. Reasons of intellectual judgment, expanded below, we feel are enough to justify calling their final results a “breakthrough.” But we offer a recent concrete case for comparison first.
The Salesman Always Rings 1.5
The Traveling Salesman Problem (TSP) seeks the shortest route to visit each of a given set of sites, coming back in a ring to the starting site. When the sites are in real space and “shortest” refers to Euclidean distance, finding the length of the absolute shortest route remains -hard even in the plane, as discussed here. However, Nicos Christofides in 1976 found a polynomial-time algorithm that finds a tour guaranteed to have total length at most . The method and proof apply to various other cases where the distances between sites obey inequalities that define some kind of metric, with the same bound.
This stood as a barrier for most of these cases, until this year. Shayan Oveis Gharan, Amit Saberi, and Mohit Singh (OSS) broke it for a wide subclass of these problems. Well they “broke” it by achieving a guaranteed ratio to the optimum of
Actually their original paper did not give a value—it merely proved the existence of an giving . The estimate for the magnitude of their breakthrough was supplied later.
Improvements and Predictions
It must be said that the OSS paper and its predecessor introduced techniques that were more novel to the problem than is evident for the new matrix-multiplication papers, and the predecessor won the SODA 2010 “Best Paper” award. However, a referee could have wondered, why bother moving heaven and earth in a restricted case to demonstrate an increment a thousand times smaller than “nano”? Indeed the version of OSS linked above still ends with “” on page 65.
What happened is that the change attracted interest on several continents, and the people in Europe who made a bankable improvement are not even in the euro zone. First Tobias Mömke and Ola Svensson of KTH in Stockholm obtained by using matchings in place of the more-complicated ingredients of OSS, also in time for FOCS 2011. Then Marcin Mucha of the University of Warsaw improved their analysis to obtain
It is noted by all that an even older paper by Michael Held and Richard Karp from 1970 had set a believable target for provable improvements of Christofides’ bound, namely a conjectured ratio of from linear programming relaxations. However, also has dueling conjectures, some supporting a believable target of . The real point of comparison with TSP is that suddenly there was progress on a barrier that had stood for much of the age of the field, and this attracted others to try for more.
We note that Markus Bläser has contended in a comment that the extension of CW used by both new papers has limitations, and we infer that some other experts concur. However, the paper by Vassilevska Williams in particular has two new ingredients: a framework for managing any tensor power as a base algorithm, and a computer implementation of the framework. She also employed an insight of Stothers to simplify the analysis. Though we can imagine a geometrical insight that higher powers bring returns that “shrink geometrically,” can we really constrain the likelihood that pursuing them might dislodge a different insight that changes the whole game? Is there a limitation theorem here? All we know is that the game has changed, and perhaps the game is afoot.
Worth and Values
The last issue we wish to raise is the reliability of judging the worth of past achievement by present assessment of what it may or may not lead to in the future. We have posted several times on surprises and the difficulty of predicting or guessing how things will go. Of course impetus into the future is necessarily part of claiming something now a “breakthrough.”
However, if we seek a reliable and consistent standard of judging worth, we should use a value that the community has understood for many years: intellectual substance and effort. This value, plus the simple salience of the goal, led our principals to invest the years of work in the first place. Depth of thinking is our gold standard, while applicability is our paper currency. The improvements of and on paper for amid other issues may be a poor return now, but the new computer-assisted vein to mine may parlay true value later.
Dick and I hope this explains why fundamental effort and easily-stated achievement, on a fundamental problem after nearly a quarter-century of stasis, elicited the reaction it has. We agree with, and have tried to extend, Timothy Gowers’ comments here. As for how these values can be invested, only time will be the teller.
Open Problems
Can we be more open about the value of pursuing problems?
What is ?
Suppose we know something special about two matrices and to be multiplied, something not so obvious like their being sparse. Suppose in particular that we have a tiny circuit that for any outputs the value , and similarly for , where those values come from a fixed finite set. Or suppose we know that and/or preserve a similar succinctness property from argument vectors to values—which[…]
All_Posts
Ideas
News
Open_Problems
People
Proofs
Results
Algorithms
approximation
matrix
multiplication
omega
TSP
from google
Andrew Stothers wrote in 2007 a first-year qualifying report for the University of Edinburgh PhD programme. In only eleven pages he presented the parts of the famous paper by Don Coppersmith and Shmuel Winograd (CW) that do enough to obtain for the exponent of matrix multiplication. In a few more pages he surveyed recent work on sufficient conditions for . Page 15, the last except for references, laid out “Possible Directions for Future Work” in five sentences, the fifth being:
Finally, it will also be possible to investigate bilinear algorithms further and use more complex means to determine whether or not “better” algorithms exist for matrix multiplication.
Today we wish to explain what means, why lower is always “better,” why we feel our field is better for the great efforts expended by Stothers and Virginia Vassilevska Williams, and why it is better when intellectual values guide our judgment. We also wish to defend ourselves and others for calling the results “breakthroughs,” comparing and contrasting to recent advances on algorithms for the Traveling Salesman Problem.
I (Ken) wrote a similar first-year qualifier in summer 1982 after my first year at Oxford—they are a staple of leading British universities. Like Andrew’s it was not obligated to have original research, just to demonstrate understanding and capability. One item found its way into a conference paper, but my 1986 dissertation ranged into various other topics.
What distinguishes Andrew’s Nov. 2010 dissertation is its singular focus on accomplishing the objective of his proposal. And doing it. The resulting paper with his advisor Sandy Davie has recently been submitted. Meanwhile in North America, Virginia had been thinking about the same problem since 2005. It may not be true as asserted here that “hundreds of people tried to improve” it after 1987, but we’ll guess about as many people thought about it. It’s enough to note that many researchers felt it worthy of pursuit, including Andrew’s advisor, and that progress took large concerted effort. The nature of this effort—including the use of computers—may be an important harbinger for science by itself.
What Does ω Mean?
The definition of is the infimum of all such that matrix multiplication has an algorithm that runs in time . This is in a classical model of computation where addition and multiplication of scalars have unit cost. Note that there might not be an algorithm that runs in time itself— may properly be a limit even if it turns out to have the least possible value 2. Currently all we know is that from Virginia’s work.
As many have noted, does not have immediate practical relevance because there is evidently a tradeoff with the constant hiding under the in . The tradeoff is so far steep enough that apparently only Volker Strassen’s relatively simple algorithm achieving an exponent of has wide use. So why do we care about values of less than that?
One of the central questions in physics is, why do the fundamental constants of nature have the values they do? The desire for why, in the face of evidence that the values could be arbitrary, has aroused such passion that string theorist David Gross channeled Winston Churchill in exhorting young physicists to “never, never, never give up.”
Now in mathematics it may seem nonsensical to ask, why do numbers like and have the values they do? But with , we are in a sense closer to physics: understanding it is akin to discovering a natural law, at least a law of information. And quantum mechanics has if anything magnified William Hamilton’s vision approaching 200 years ago that Nature computes with matrices, yes large ones. It is possible that detracts from a better statement of important laws, but knowing better brings us closer to them all the same.
Barriers and Breakthroughs
Hence also the interest in whether takes a value with known meaning, such as or (both falsified) or , or as some have mused, or , or as some have argued more likely for exponents, a logarithm of some higher and simpler number. Strassen himself has often stated his belief that is strictly greater than . If it has a value not previously seen in mathematics, then we could hope to discover new mathematical regularities as well; if it has a known value, then this may yield some more explanation about algorithms.
The value seemed a natural possible barrier, but the time interval from to in 1981–82 was very short. By contrast the “CW” bound of was not a natural-seeming number, but it withstood attempts to scale it for 23 years, including a year’s work from each of Virginia and Andrew. Reasons of intellectual judgment, expanded below, we feel are enough to justify calling their final results a “breakthrough.” But we offer a recent concrete case for comparison first.
The Salesman Always Rings 1.5
The Traveling Salesman Problem (TSP) seeks the shortest route to visit each of a given set of sites, coming back in a ring to the starting site. When the sites are in real space and “shortest” refers to Euclidean distance, finding the length of the absolute shortest route remains -hard even in the plane, as discussed here. However, Nicos Christofides in 1976 found a polynomial-time algorithm that finds a tour guaranteed to have total length at most . The method and proof apply to various other cases where the distances between sites obey inequalities that define some kind of metric, with the same bound.
This stood as a barrier for most of these cases, until this year. Shayan Oveis Gharan, Amit Saberi, and Mohit Singh (OSS) broke it for a wide subclass of these problems. Well they “broke” it by achieving a guaranteed ratio to the optimum of
Actually their original paper did not give a value—it merely proved the existence of an giving . The estimate for the magnitude of their breakthrough was supplied later.
Improvements and Predictions
It must be said that the OSS paper and its predecessor introduced techniques that were more novel to the problem than is evident for the new matrix-multiplication papers, and the predecessor won the SODA 2010 “Best Paper” award. However, a referee could have wondered, why bother moving heaven and earth in a restricted case to demonstrate an increment a thousand times smaller than “nano”? Indeed the version of OSS linked above still ends with “” on page 65.
What happened is that the change attracted interest on several continents, and the people in Europe who made a bankable improvement are not even in the euro zone. First Tobias Mömke and Ola Svensson of KTH in Stockholm obtained by using matchings in place of the more-complicated ingredients of OSS, also in time for FOCS 2011. Then Marcin Mucha of the University of Warsaw improved their analysis to obtain
It is noted by all that an even older paper by Michael Held and Richard Karp from 1970 had set a believable target for provable improvements of Christofides’ bound, namely a conjectured ratio of from linear programming relaxations. However, also has dueling conjectures, some supporting a believable target of . The real point of comparison with TSP is that suddenly there was progress on a barrier that had stood for much of the age of the field, and this attracted others to try for more.
We note that Markus Bläser has contended in a comment that the extension of CW used by both new papers has limitations, and we infer that some other experts concur. However, the paper by Vassilevska Williams in particular has two new ingredients: a framework for managing any tensor power as a base algorithm, and a computer implementation of the framework. She also employed an insight of Stothers to simplify the analysis. Though we can imagine a geometrical insight that higher powers bring returns that “shrink geometrically,” can we really constrain the likelihood that pursuing them might dislodge a different insight that changes the whole game? Is there a limitation theorem here? All we know is that the game has changed, and perhaps the game is afoot.
Worth and Values
The last issue we wish to raise is the reliability of judging the worth of past achievement by present assessment of what it may or may not lead to in the future. We have posted several times on surprises and the difficulty of predicting or guessing how things will go. Of course impetus into the future is necessarily part of claiming something now a “breakthrough.”
However, if we seek a reliable and consistent standard of judging worth, we should use a value that the community has understood for many years: intellectual substance and effort. This value, plus the simple salience of the goal, led our principals to invest the years of work in the first place. Depth of thinking is our gold standard, while applicability is our paper currency. The improvements of and on paper for amid other issues may be a poor return now, but the new computer-assisted vein to mine may parlay true value later.
Dick and I hope this explains why fundamental effort and easily-stated achievement, on a fundamental problem after nearly a quarter-century of stasis, elicited the reaction it has. We agree with, and have tried to extend, Timothy Gowers’ comments here. As for how these values can be invested, only time will be the teller.
Open Problems
Can we be more open about the value of pursuing problems?
What is ?
Suppose we know something special about two matrices and to be multiplied, something not so obvious like their being sparse. Suppose in particular that we have a tiny circuit that for any outputs the value , and similarly for , where those values come from a fixed finite set. Or suppose we know that and/or preserve a similar succinctness property from argument vectors to values—which[…]
december 2011 by josephzizys
Another Annoying Open Problem
november 2011 by josephzizys
The k-server problem: some progress, still wide open
Aleksander Mądry gave a stellar talk at our ARC Theory Day a week ago Friday. He is an expert on algorithmic graph theory, among other areas of theory. Already he has multiple best-paper awards, including the paper of this talk from FOCS 2011, and I expect he will get more in the future. This is good going considering that basic LaTeX systems lack a macro for the Polish ogonek diacritic in his name—Ken installed a special package called tipa to get it while editing this post.
Today I would like to talk about his result and the open questions that remain.
He with Nikhil Bansal, Niv Buchbinder, and Seffi Naor (BBMN) have made a recent important contribution to our understanding of the complexity of on-line algorithms. Aleksander’s talk was wonderful and explained what is known, their new result, and what still remains to be done. As he said:
One of the remaining open questions is really annoying—it should be solved—but it continues to resist attacks.
Before I discuss their result I would like to thank the Theory Day organizers Prasad Tetali and Prasad Raghavendra. It was held on the magical day 11/11/11, which does not happen that often, and in fact Mądry’s talk spanned 11:11:11am. Ken wore a red shirt in Buffalo but could not find the WW I remembrance poppy pin he acquired during his sabbatical in Canada. The full scorecard of talks and titles was:
Thomas Dueholm Hansen: Subexponential lower bounds for randomized pivoting rules for the simplex algorithm.
Aleksander Mądry: Online algorithms and the k-server conjecture.
Mohit Singh: A Randomized Rounding Approach for Symmetric TSP.
Ryan Williams: Algorithms for Circuits and Circuits for Algorithms.
The Problem
The problem he spoke about is the now classic on-line server problem. The simplest case is the problem of managing a computer $. Note that this is the way to tell a theory talk on computer caches from an architecture talk on computer caches—in architecture often $ is used to denote a cache. Perhaps this is changing, given the current economic turmoil, perhaps not.
The key is that there are pages in the cache and total pages. Each time a page is requested, if it is not in the cache the task is to decide which page to “evict” from the cache. If the strategy is deterministic, then as Danny Sleator and Bob Tarjan proved back in 1985, the best strategy is only -competitive. This means that the best deterministic strategy could be as bad as times the best off-line strategy which is allowed to see all the page requests at once.
Randomization comes to the rescue. If the strategy for eviction can be random, then it is known that an on-line random strategy exists that is competitive. This is due to Amos Fiat, Dick Karp, Mike Luby, Lyle McGeoch, Danny Sleator, and Neal Young in 1991.
The paging problem is just a special case of the general -server problem. In the general case the simple cache mechanism is replaced by one based on an arbitrary finite metric space. The servers at any step are located somewhere on the points of the metric space. The requests are to “serve” a point in the space: a server must move to that point, and it incurs a cost of the distance in the metric.
In the 1990′s this problem was very “hot,” and there was a stream of results that tried to get good deterministic server strategies. The first results were exponential in but independent of : recall there are servers and the space has points. Finally—in a famous paper—Elias Koutsoupias and Christos Papadimitriou 1994 showed that there is a strategy achieving , which remains today the best known result.
The BBMN Result
The new result is:
Theorem 1 There exists an -competitive on-line strategy for the -server problem.
This is a great improvement, but it also changes the rules. It is great since the competitiveness bound is poly-log in . However, all the previous results we discussed had competitiveness bounds that did not depend on the size of the metric space, . Their result does.
See their paper for the proof. The key idea is that they: Reduce the -server problem over arbitrary metric to a (more difficult) problem over a very simple metric The very simple metric is a type of tree-like metric. The proof relies on the fact that any metric space on points can be well approximated by such a metric space. The cost of the approximation grows roughly as , so it is not surprising that their theorem has factors of in the competitiveness bound.
Open Problems
The -server problem has the following bounds and gaps in those bounds:
If the strategy is deterministic it must take and can be done in . There is a small, but annoying gap of two here. Which is correct?
If the strategy is randomized it must take . There is a huge and annoying gap to the known upper bounds here. There is no known strategy for even the simple case of the real line that is randomized but beats linear in . What is going on?
Finally, are the polylog factors of in their result necessary? In particular can one get rid of the dependence on altogether?
[fixed typo in Fiat et al. to O(log k)]
1
All_Posts
News
Open_Problems
Results
Algorithms
approximation
deterministic
lower_bounds
randomness
from google
Aleksander Mądry gave a stellar talk at our ARC Theory Day a week ago Friday. He is an expert on algorithmic graph theory, among other areas of theory. Already he has multiple best-paper awards, including the paper of this talk from FOCS 2011, and I expect he will get more in the future. This is good going considering that basic LaTeX systems lack a macro for the Polish ogonek diacritic in his name—Ken installed a special package called tipa to get it while editing this post.
Today I would like to talk about his result and the open questions that remain.
He with Nikhil Bansal, Niv Buchbinder, and Seffi Naor (BBMN) have made a recent important contribution to our understanding of the complexity of on-line algorithms. Aleksander’s talk was wonderful and explained what is known, their new result, and what still remains to be done. As he said:
One of the remaining open questions is really annoying—it should be solved—but it continues to resist attacks.
Before I discuss their result I would like to thank the Theory Day organizers Prasad Tetali and Prasad Raghavendra. It was held on the magical day 11/11/11, which does not happen that often, and in fact Mądry’s talk spanned 11:11:11am. Ken wore a red shirt in Buffalo but could not find the WW I remembrance poppy pin he acquired during his sabbatical in Canada. The full scorecard of talks and titles was:
Thomas Dueholm Hansen: Subexponential lower bounds for randomized pivoting rules for the simplex algorithm.
Aleksander Mądry: Online algorithms and the k-server conjecture.
Mohit Singh: A Randomized Rounding Approach for Symmetric TSP.
Ryan Williams: Algorithms for Circuits and Circuits for Algorithms.
The Problem
The problem he spoke about is the now classic on-line server problem. The simplest case is the problem of managing a computer $. Note that this is the way to tell a theory talk on computer caches from an architecture talk on computer caches—in architecture often $ is used to denote a cache. Perhaps this is changing, given the current economic turmoil, perhaps not.
The key is that there are pages in the cache and total pages. Each time a page is requested, if it is not in the cache the task is to decide which page to “evict” from the cache. If the strategy is deterministic, then as Danny Sleator and Bob Tarjan proved back in 1985, the best strategy is only -competitive. This means that the best deterministic strategy could be as bad as times the best off-line strategy which is allowed to see all the page requests at once.
Randomization comes to the rescue. If the strategy for eviction can be random, then it is known that an on-line random strategy exists that is competitive. This is due to Amos Fiat, Dick Karp, Mike Luby, Lyle McGeoch, Danny Sleator, and Neal Young in 1991.
The paging problem is just a special case of the general -server problem. In the general case the simple cache mechanism is replaced by one based on an arbitrary finite metric space. The servers at any step are located somewhere on the points of the metric space. The requests are to “serve” a point in the space: a server must move to that point, and it incurs a cost of the distance in the metric.
In the 1990′s this problem was very “hot,” and there was a stream of results that tried to get good deterministic server strategies. The first results were exponential in but independent of : recall there are servers and the space has points. Finally—in a famous paper—Elias Koutsoupias and Christos Papadimitriou 1994 showed that there is a strategy achieving , which remains today the best known result.
The BBMN Result
The new result is:
Theorem 1 There exists an -competitive on-line strategy for the -server problem.
This is a great improvement, but it also changes the rules. It is great since the competitiveness bound is poly-log in . However, all the previous results we discussed had competitiveness bounds that did not depend on the size of the metric space, . Their result does.
See their paper for the proof. The key idea is that they: Reduce the -server problem over arbitrary metric to a (more difficult) problem over a very simple metric The very simple metric is a type of tree-like metric. The proof relies on the fact that any metric space on points can be well approximated by such a metric space. The cost of the approximation grows roughly as , so it is not surprising that their theorem has factors of in the competitiveness bound.
Open Problems
The -server problem has the following bounds and gaps in those bounds:
If the strategy is deterministic it must take and can be done in . There is a small, but annoying gap of two here. Which is correct?
If the strategy is randomized it must take . There is a huge and annoying gap to the known upper bounds here. There is no known strategy for even the simple case of the real line that is randomized but beats linear in . What is going on?
Finally, are the polylog factors of in their result necessary? In particular can one get rid of the dependence on altogether?
[fixed typo in Fiat et al. to O(log k)]
november 2011 by josephzizys
More Quantum Chocolate Boxes
november 2011 by josephzizys
Unwrapping the layers between classical and quantum?
Daniel Simon discovered the first quantum algorithm that yielded an oracle separation of from an exponential-time analogue of . Even more, his algorithm was the direct inspiration for Peter Shor’s factoring algorithm, which roughly speaking replaced Simon’s use of the Hadamard transform with the quantum Fourier transform. Apologies to Christopher Marlowe, Shor’s algorithm was the one with “the phase that launched a thousand scrips” (scrips meaning papers, funding, you name it).
Today we wish to revisit the question of solving Simon’s problem by a classical algorithm. We also ask whether the classical algorithms are being given a fair counterpart to the following principle, which the quantum algorithms take for granted:
Given an invertible function, there is a unitary matrix that implements the function as a permutation matrix, and moreover: the matrix is efficiently constructible, in a way that is realizable by a physical system.
Let’s call this the quantum black box assumption (QBB). Without it there is no Deutsch(-Jozsa) or Simon or Grover algorithm, and in the details, no Shor algorithm either (compare discussion here). Without it there are no quantum algorithms for anything interesting, since it is a central assumption required by all them. It is justified by polynomial-time universality theorems for quantum circuits, but is it really innocuous? And since the matrix computes the function at any algebraic input, should not the representation for the classical algorithms do the same?
Ironically, Simon himself is not sanguine on the prospects of quantum computers. He works on Windows security at Microsoft, and was among the many builders of the Transport Layer Security (TLS) protocol. In March 2009 he contributed comments noted here, including this statement:
Basically, there is one thing that quantum computers [are known to do] better than classical computers. Because this one thing happens to lead to polynomial-time algorithms for integer factoring and discrete log, quantum computers have been bandied about as an incredible new computing technology, but the truth is that this one thing is really very limited in scope, and in a decade and a half, nobody’s found another significant application for it.
Moreover, there are lots of (admittedly informal) reasons for believing that quantum computers can’t really do anything interesting beyond this one thing.
Simon would add, following the title of his own blog, “I could be wrong.” We could be wrong likewise, but let’s first look at Simon’s problem.
Simon’s Distinguishing Problem
We describe Simon’s problem with the same chocolate-box metaphor we recently used here. Again we are trying to identify a Boolean function , except this time maps into itself, i.e., the output is an -bit string not just 0 or 1. Again we are told that is a special function that belongs to a limited number of “boxes.” Here we have one box for every “hidden” string :
As usual means bitwise exclusive-or. When is the all-zero vector, the condition holds iff is 1-to-1, and so box includes all that are bijectively onto . The other boxes, however, comprise functions that are exactly 2-to-1 in a particular periodic way.
The input to Simon’s problem is an oracle for an unknown member of one of the boxes. The first goal is to distinguish the case from being in one of the other boxes. The full goal is to tell which box, i.e., to compute with high probability. Simon’s beautiful result is that this can be done by a polynomial time quantum algorithm.
As in our previous post, is given as an operator on qubits, such that on input padded with -many 0′s, outputs the pair . The point of necessity is that this computation is 1-to-1 even when itself is 2-to-1.
Some Basics
The key to understanding a quantum algorithm is to look at the vectors and unitary operators. All in this case are from real Hilbert spaces, or if you prefer finite dimensional Euclidean spaces. We will use capital letters for our vectors, and script letters for our matrices. We also avoid subscripts and use the simple notion to denote the coordinate of the vector . If you like we are viewing vectors as functions.
The Hadamard matrix is denoted by where is always . We also use
to denote the row and column of the matrix . Note, the important fact that
where is the boolean inner product of and . If is a vector, then is defined by
Simon’s Quantum Algorithm
The first key insight is that we can use a quantum algorithm to efficiently construct a vector so that
when and is zero otherwise. The is just the concatenation of the numbers and . Essentially we are using two Hilbert spaces that we put together by a tensor product. Note: this is where the QBB assumption is invoked—without this assumption the algorithm cannot even get started—since where is a -qubit state prepared as the equal superposition of basis states in the first qubits, all with zeroes in the second qubits.
Then we can apply the matrix to and yield the vector . The key is that
This is what is meant by applying the Hadamard only to the coordinate.
We need to analyze what the amplitude is of the coordinates . There are two cases.
In this case is one-to-one. Let see what is then. As the sum runs over values the only time is non-zero is when . This only happens once, and therefore
Thus any measurement of will yield simply a random uniform number in the range to . After a number of these samples it will be clear that there is no .
In this case is two-to-one. Let see what is in this case—it is much more interesting. If has no pre-image under , then . If it has a pre-image it also has a pre-image . In this case
This is either or depending on whether or not . This gives an equation on over the boolean finite field. After enough equations one can solve for and solve the problem.
Quantum Versus Classical Settings
All discussions of Simon’s algorithm immediately point out that there is no classical algorithm that can solve the problem without an exponential number of calls to the black box for . This is easy to prove and is unassailable—it is correct. Case closed: quantum is more powerful than classical.
Well of course that is not quite true. If the black box for were presented by a circuit of polynomial size, then the argument that a polynomial amount of classical computing is inadequate would require the assumption that . So the claim that quantum is stronger than classical can be true if the conventional wisdom is true, but who knows?
We suggest that the proper comparison may involve granting the classical setting a multi-linear extension of to use as the black box. This is not restricted to Boolean queries, but can be evaluated on points like that strike us as analogous to Hadamard-transformed states given to . One rub is that if is an unsatisfiable Boolean formula, and comes out to be the all-zero function, recognizing this fact is -hard. If seems too much to assume, compared to the QBB, we are reminded of a common story where one child makes hands like a tank approaching another child, and the second child swoops a hand down like a missile to blow up the tank.
Child 1: “Where did you get that missile?”
Child 2: “Well, where did you get that tank?”
The Classical Black Box Assumption
To make our assumption concrete, we settle on this extension of a Boolean function to a mapping from to :
That is, is the product of factors for and for . For instance, when and ,
The effect is that over Boolean arguments , is zero except when , so that for all Boolean .
Of course for other arguments , can be a sum of exponentially many terms, and it is not evident that can be regarded as easily-given as is. Indeed that is the point. But let’s make the assumption clear:
Given a function with a polynomial size Boolean circuit, there is an arithmetic circuit that is also polynomial in size and computes .
This assertion itself does not force , though obtaining the arithmetic circuit might do so. We will call this assumption the classical black box assumption (CBB). It seems unlikely to be true, but we cannot rule it out immediately. We can also compare with the “Arithmetic Checkability” axiom of this paper by Russell Impagliazzo, Valentine Kabanets, and Antonina Kolokolova.
And for a companion question, is the QBB assumption really true in prospect? Why is it fair to have a quantum linear operator that can be applied to other quantum states besides the 0-1 basis states, and not be able to do so with a classical multi-linear extension? Or is this where the first power of quantum over classical resides?
Using CBB for Simon’s Problem
Can we use the CBB assumption to create a classical algorithm for Simon’s problem? We would like to report that the answer is yes, but we leave it open. It seems likely to be true, yet we cannot prove it. Still we find the question interesting in its own right, and in shedding new light on QBB.
The goal now is to show that evaluating at the right values will yield information about the structure of , just as Simon’s algorithm uses QBB to get information about . Here is just a note on how to start, and some difficulties that we hope you can help us resolve. When , the “schematic terms” all give value . It follows that whenever is 1-to-1,
When is 2-to-1, it is possible that . If it doesn’t, then we instantly deduce . Whether it does depends completely on the range of , not on the value of . When , this happens exactly when the range of is , or when it is .
Our approach now is to ask, what happens on arguments whose entries are or , perhaps chosen at random? For such we have
where means the number of places where and […]
All_Posts
Ideas
Open_Problems
Algorithms
classical
quantum
randomness
from google
Daniel Simon discovered the first quantum algorithm that yielded an oracle separation of from an exponential-time analogue of . Even more, his algorithm was the direct inspiration for Peter Shor’s factoring algorithm, which roughly speaking replaced Simon’s use of the Hadamard transform with the quantum Fourier transform. Apologies to Christopher Marlowe, Shor’s algorithm was the one with “the phase that launched a thousand scrips” (scrips meaning papers, funding, you name it).
Today we wish to revisit the question of solving Simon’s problem by a classical algorithm. We also ask whether the classical algorithms are being given a fair counterpart to the following principle, which the quantum algorithms take for granted:
Given an invertible function, there is a unitary matrix that implements the function as a permutation matrix, and moreover: the matrix is efficiently constructible, in a way that is realizable by a physical system.
Let’s call this the quantum black box assumption (QBB). Without it there is no Deutsch(-Jozsa) or Simon or Grover algorithm, and in the details, no Shor algorithm either (compare discussion here). Without it there are no quantum algorithms for anything interesting, since it is a central assumption required by all them. It is justified by polynomial-time universality theorems for quantum circuits, but is it really innocuous? And since the matrix computes the function at any algebraic input, should not the representation for the classical algorithms do the same?
Ironically, Simon himself is not sanguine on the prospects of quantum computers. He works on Windows security at Microsoft, and was among the many builders of the Transport Layer Security (TLS) protocol. In March 2009 he contributed comments noted here, including this statement:
Basically, there is one thing that quantum computers [are known to do] better than classical computers. Because this one thing happens to lead to polynomial-time algorithms for integer factoring and discrete log, quantum computers have been bandied about as an incredible new computing technology, but the truth is that this one thing is really very limited in scope, and in a decade and a half, nobody’s found another significant application for it.
Moreover, there are lots of (admittedly informal) reasons for believing that quantum computers can’t really do anything interesting beyond this one thing.
Simon would add, following the title of his own blog, “I could be wrong.” We could be wrong likewise, but let’s first look at Simon’s problem.
Simon’s Distinguishing Problem
We describe Simon’s problem with the same chocolate-box metaphor we recently used here. Again we are trying to identify a Boolean function , except this time maps into itself, i.e., the output is an -bit string not just 0 or 1. Again we are told that is a special function that belongs to a limited number of “boxes.” Here we have one box for every “hidden” string :
As usual means bitwise exclusive-or. When is the all-zero vector, the condition holds iff is 1-to-1, and so box includes all that are bijectively onto . The other boxes, however, comprise functions that are exactly 2-to-1 in a particular periodic way.
The input to Simon’s problem is an oracle for an unknown member of one of the boxes. The first goal is to distinguish the case from being in one of the other boxes. The full goal is to tell which box, i.e., to compute with high probability. Simon’s beautiful result is that this can be done by a polynomial time quantum algorithm.
As in our previous post, is given as an operator on qubits, such that on input padded with -many 0′s, outputs the pair . The point of necessity is that this computation is 1-to-1 even when itself is 2-to-1.
Some Basics
The key to understanding a quantum algorithm is to look at the vectors and unitary operators. All in this case are from real Hilbert spaces, or if you prefer finite dimensional Euclidean spaces. We will use capital letters for our vectors, and script letters for our matrices. We also avoid subscripts and use the simple notion to denote the coordinate of the vector . If you like we are viewing vectors as functions.
The Hadamard matrix is denoted by where is always . We also use
to denote the row and column of the matrix . Note, the important fact that
where is the boolean inner product of and . If is a vector, then is defined by
Simon’s Quantum Algorithm
The first key insight is that we can use a quantum algorithm to efficiently construct a vector so that
when and is zero otherwise. The is just the concatenation of the numbers and . Essentially we are using two Hilbert spaces that we put together by a tensor product. Note: this is where the QBB assumption is invoked—without this assumption the algorithm cannot even get started—since where is a -qubit state prepared as the equal superposition of basis states in the first qubits, all with zeroes in the second qubits.
Then we can apply the matrix to and yield the vector . The key is that
This is what is meant by applying the Hadamard only to the coordinate.
We need to analyze what the amplitude is of the coordinates . There are two cases.
In this case is one-to-one. Let see what is then. As the sum runs over values the only time is non-zero is when . This only happens once, and therefore
Thus any measurement of will yield simply a random uniform number in the range to . After a number of these samples it will be clear that there is no .
In this case is two-to-one. Let see what is in this case—it is much more interesting. If has no pre-image under , then . If it has a pre-image it also has a pre-image . In this case
This is either or depending on whether or not . This gives an equation on over the boolean finite field. After enough equations one can solve for and solve the problem.
Quantum Versus Classical Settings
All discussions of Simon’s algorithm immediately point out that there is no classical algorithm that can solve the problem without an exponential number of calls to the black box for . This is easy to prove and is unassailable—it is correct. Case closed: quantum is more powerful than classical.
Well of course that is not quite true. If the black box for were presented by a circuit of polynomial size, then the argument that a polynomial amount of classical computing is inadequate would require the assumption that . So the claim that quantum is stronger than classical can be true if the conventional wisdom is true, but who knows?
We suggest that the proper comparison may involve granting the classical setting a multi-linear extension of to use as the black box. This is not restricted to Boolean queries, but can be evaluated on points like that strike us as analogous to Hadamard-transformed states given to . One rub is that if is an unsatisfiable Boolean formula, and comes out to be the all-zero function, recognizing this fact is -hard. If seems too much to assume, compared to the QBB, we are reminded of a common story where one child makes hands like a tank approaching another child, and the second child swoops a hand down like a missile to blow up the tank.
Child 1: “Where did you get that missile?”
Child 2: “Well, where did you get that tank?”
The Classical Black Box Assumption
To make our assumption concrete, we settle on this extension of a Boolean function to a mapping from to :
That is, is the product of factors for and for . For instance, when and ,
The effect is that over Boolean arguments , is zero except when , so that for all Boolean .
Of course for other arguments , can be a sum of exponentially many terms, and it is not evident that can be regarded as easily-given as is. Indeed that is the point. But let’s make the assumption clear:
Given a function with a polynomial size Boolean circuit, there is an arithmetic circuit that is also polynomial in size and computes .
This assertion itself does not force , though obtaining the arithmetic circuit might do so. We will call this assumption the classical black box assumption (CBB). It seems unlikely to be true, but we cannot rule it out immediately. We can also compare with the “Arithmetic Checkability” axiom of this paper by Russell Impagliazzo, Valentine Kabanets, and Antonina Kolokolova.
And for a companion question, is the QBB assumption really true in prospect? Why is it fair to have a quantum linear operator that can be applied to other quantum states besides the 0-1 basis states, and not be able to do so with a classical multi-linear extension? Or is this where the first power of quantum over classical resides?
Using CBB for Simon’s Problem
Can we use the CBB assumption to create a classical algorithm for Simon’s problem? We would like to report that the answer is yes, but we leave it open. It seems likely to be true, yet we cannot prove it. Still we find the question interesting in its own right, and in shedding new light on QBB.
The goal now is to show that evaluating at the right values will yield information about the structure of , just as Simon’s algorithm uses QBB to get information about . Here is just a note on how to start, and some difficulties that we hope you can help us resolve. When , the “schematic terms” all give value . It follows that whenever is 1-to-1,
When is 2-to-1, it is possible that . If it doesn’t, then we instantly deduce . Whether it does depends completely on the range of , not on the value of . When , this happens exactly when the range of is , or when it is .
Our approach now is to ask, what happens on arguments whose entries are or , perhaps chosen at random? For such we have
where means the number of places where and […]
november 2011 by josephzizys
Poker and Cantor’s Proof
october 2011 by josephzizys
The reals are still uncountable
Poker Source
Dave Patterson is a world-famous expert on computer architecture. He has won many awards, and is a member of the National Academy of Engineering and the National Academy of Sciences. He coined the term “RISC” and later the term “RAID.” He also wrote definitive textbooks on computer architecture, with John Hennessy, including the famous Computer Architecture: A Quantitative Approach..
Today I and Ken want to continue the last discussion on Georg Cantor’s Diagonal Proof. My goal is to try to answer one of the issues raised about this famous proof. I do thank all the commenters for an interesting discussion.
What does this have to do with Dave? He is not a theorist, and may know Cantor’s proof—but certainly is not excited about it. He is very practical-oriented: his most famous work is on the design of Reduced Instruction Set Computing processors.
But Dave is a poker player. When I arrived at Berkeley in the late 1970′s, I wanted to bring one tradition from Yale to Berkeley: playing a “friendly” game of cards on a regular basis. I assumed that my theory colleagues, Dick Karp and others, would be interested in playing. The game I had in mine was low stakes poker. The rules were “dealer’s choice” and modified “pot limit.” The latter means one could make a bet that was bounded by half the current pot. This method of betting was introduced to me by Albert Meyer, and it can lead to exponential increases in bets. This increase makes the game exciting—I felt—and allows for bluffing. Even through the stakes are low, the game can be quite exciting.
In any event I was sure that my theory friends at Berkeley would jump at a chance to put probability theory to a practical test. I was wrong. Every member of the theory group said: not interested, sorry. The sole possible exception was Karp, but poker requires players for much greater than to be fun. I was about to give up when I mentioned it one day to Patterson. Before I knew it all the computer architecture group—well most—had signed up, and a new course was created:
CS 52
When you got the message, “CS 52 this week at X’s house,” you knew the game was on. We played for the two years that I was at Berkeley, and had a great deal of fun. I believe that they still play poker there, although the game has changed to Texas Hold ‘Em and the betting is different. I believe that it is one of my great and lasting contributions to Berkeley Computer Science Department. Perhaps they will put up a plaque that says…
Show Your Hand
In almost any version of Poker—Texas Hold ‘Em, Stud, Draw, or any other variant—there comes a time when all the bets and raises and re-raises are done. At this point, if you are the person who last bet or raised, you must show your hand, that is, must lay your cards on the table. That’s part of the meaning of saying that you have been called. Then the other players can either throw their hands away or show their cards and prove that they have a better hand. The winner takes the money.
If you are called you must show your cards. You are not allowed, if called, to say I will show my cards later; or go through the deck and change you hand. The game has come to the showdown—you must show what you have. Not anything else. Any attempt to not do this is considered cheating—do not try this in a real game.
The same principle of showing your hand applies to mathematical proofs. For instance, if you are a Count and thus believe that the reals are countable, then you must be prepared to give a listing of the reals—essentially show your cards. This is where most Counts have trouble: they want to be able to change their “hands.” Tim Gowers said it best when he wrote that he usually asks a Count to present the actual listing. Essentially Tim is calling them, and demanding that they show their hand.
Note that this requirement is standard math, and has nothing at all to do with countable sets, uncountable sets, infinities, or the reals. If you are claiming that is rational, you must be prepared to agree that there is some fixed and natural numbers, so that
You are not allowed to change the values of and during the proof. Either there is a fixed or there is not.
Another example is Euclid’s classic proof that there are an infinite number of primes. If you believed that there is a largest prime, then you must be prepared to say that is the largest prime. Then, Euclid can use this number to force you to consider
The argument then shows that is either prime or is divisible by a prime , where must be larger than .
An even simpler version of this is the “largest number game.” You name a number and then I have to name a larger one. Again if you think that there is a largest one, you must be prepared to say that the largest one is , say. Then I point out that is larger. Scott Aaronson retold here an old story based on this game:
Two noblemen vie to name the bigger number. The first, after ruminating for hours, triumphantly announces “Eighty-three!” The second, mightily impressed, replies “You win.”
Back To The Proof
If you believe the reals are countable, then you must show your hand. You must fix a listing of the reals:
This is nothing different from picking a pair to show that is rational, that there is a largest prime , or even that there is a largest number . This is the show your hand step. If you disagree with this step in Cantor’s proof, then you must also do the same in the above simple examples. You also must not play Poker by the standard rules.
Once I have seen your hand, the sequence , I can try and find a real that you missed. There are many ways that I can do this. Since I can always construct a real number based on your sequence, your hand, that is not any , I win. Your sequence does not contain all the reals.
Leading Out Versus Check-Raising
Ken on the other hand recalls he did not play a single hand of poker during 5-1/2 years at Oxford, and has picked up poker recently only because of the interest of his son. But he finds that his own angle on Cantor’s theorem, informed by his exposure to finitists at Oxford, fits in well with the poker analogy.
In poker a player whose turn to bet comes first is “under the gun” and often at a disadvantage. Assertively “leading out” with a sizable bet gives opponents more leverage to raise since more of the player’s own money is in the pot. Instead the player is allowed to check, i.e. to bet zero, and subsequent players may check until a bet is made. If an opponent bets, then the first player is in the situation of being able to raise, not just call. A raise then is called a check-raise.
The math analogy is that asserting the existence of a set that is in some sense more complex than any set previously admitted is a form of leading out. If we’ve only reached a comfort zone with countable sets, then an uncountable set represents a new level of complexity. Indeed, to a finitist, every power-set is a big step up in complexity. If you think a googolplex is a meaningless quantity, then you can only iterate for a few steps before you would pass it.
Leading out with an object on a new level of complexity takes chutzpah. But even a humble finitist may be comfortable with theorems of the form, “If you have an object then I can give you a .” This is like saying “I check”—let someone else bet . If has the same intuitive complexity level as then that is like a call. If is a step up in complexity from , but similar to how was a step up from whatever was “in the pot” already, then is a check-raise.
The “Humble” Cantor Theorem
Here is a form of Cantor’s Theorem that embodies the check-call or check-raise idea, depending on how it is applied:
Theorem 1 Given any mapping from a set into its power set , we can find a subset of that is not in the range of .
Proof: Define . We claim that is not in the range of . If it were, then there would exist a such that . But then we have:
is in the set
by definition of
is not in the set
by definition of .
Since a logical statement can never be equivalent to its negation, this contradicts being in the range of .
Note that the proof itself embodies Cantor’s diagonal argument, but makes no reference to infinite constructions or possibilities, indeed no reference to infinity at all. It is not asserting anything on its own, but reacting to someone else giving an . Since has roughly the same complexity as —both involve the power set—producing is like a call.
What this has done is bring the implied onus of the bettor to show one’s hand into the theorem statement itself. It puts the question to the Count right away. For simplicity, let’s first talk about the power set of being uncountable:
If you think that is countable, then you are giving me a function from to that is onto . You can’t give me such a function, because I can give you and it will show that you were bluffing.
With a little technical work a Count should be convinced that a bluff on is equivalent to a bluff on . Thus the Count must admit that one can never have the cards to win a showdown that the reals are countable, and by “never” itself, should come on board that they are uncountable.
Raising Higher
One nice point about this formulation is that it carries over verbatim to prove the existence of non-r.e. sets. Just take to be the mapping from a Gödel number to the language accepted by the Turing machine , and what tumbles out is the creation of a diagonal language . Since is more complex than any in the sense of being non-r.e., we can call that a check-raise.
How widely can one replace theorems of the form “There exists a ” with “humbler” ones of the form “If you give me an , then I[…]
All_Posts
History
Ideas
Oldies
People
Proofs
from google
Poker Source
Dave Patterson is a world-famous expert on computer architecture. He has won many awards, and is a member of the National Academy of Engineering and the National Academy of Sciences. He coined the term “RISC” and later the term “RAID.” He also wrote definitive textbooks on computer architecture, with John Hennessy, including the famous Computer Architecture: A Quantitative Approach..
Today I and Ken want to continue the last discussion on Georg Cantor’s Diagonal Proof. My goal is to try to answer one of the issues raised about this famous proof. I do thank all the commenters for an interesting discussion.
What does this have to do with Dave? He is not a theorist, and may know Cantor’s proof—but certainly is not excited about it. He is very practical-oriented: his most famous work is on the design of Reduced Instruction Set Computing processors.
But Dave is a poker player. When I arrived at Berkeley in the late 1970′s, I wanted to bring one tradition from Yale to Berkeley: playing a “friendly” game of cards on a regular basis. I assumed that my theory colleagues, Dick Karp and others, would be interested in playing. The game I had in mine was low stakes poker. The rules were “dealer’s choice” and modified “pot limit.” The latter means one could make a bet that was bounded by half the current pot. This method of betting was introduced to me by Albert Meyer, and it can lead to exponential increases in bets. This increase makes the game exciting—I felt—and allows for bluffing. Even through the stakes are low, the game can be quite exciting.
In any event I was sure that my theory friends at Berkeley would jump at a chance to put probability theory to a practical test. I was wrong. Every member of the theory group said: not interested, sorry. The sole possible exception was Karp, but poker requires players for much greater than to be fun. I was about to give up when I mentioned it one day to Patterson. Before I knew it all the computer architecture group—well most—had signed up, and a new course was created:
CS 52
When you got the message, “CS 52 this week at X’s house,” you knew the game was on. We played for the two years that I was at Berkeley, and had a great deal of fun. I believe that they still play poker there, although the game has changed to Texas Hold ‘Em and the betting is different. I believe that it is one of my great and lasting contributions to Berkeley Computer Science Department. Perhaps they will put up a plaque that says…
Show Your Hand
In almost any version of Poker—Texas Hold ‘Em, Stud, Draw, or any other variant—there comes a time when all the bets and raises and re-raises are done. At this point, if you are the person who last bet or raised, you must show your hand, that is, must lay your cards on the table. That’s part of the meaning of saying that you have been called. Then the other players can either throw their hands away or show their cards and prove that they have a better hand. The winner takes the money.
If you are called you must show your cards. You are not allowed, if called, to say I will show my cards later; or go through the deck and change you hand. The game has come to the showdown—you must show what you have. Not anything else. Any attempt to not do this is considered cheating—do not try this in a real game.
The same principle of showing your hand applies to mathematical proofs. For instance, if you are a Count and thus believe that the reals are countable, then you must be prepared to give a listing of the reals—essentially show your cards. This is where most Counts have trouble: they want to be able to change their “hands.” Tim Gowers said it best when he wrote that he usually asks a Count to present the actual listing. Essentially Tim is calling them, and demanding that they show their hand.
Note that this requirement is standard math, and has nothing at all to do with countable sets, uncountable sets, infinities, or the reals. If you are claiming that is rational, you must be prepared to agree that there is some fixed and natural numbers, so that
You are not allowed to change the values of and during the proof. Either there is a fixed or there is not.
Another example is Euclid’s classic proof that there are an infinite number of primes. If you believed that there is a largest prime, then you must be prepared to say that is the largest prime. Then, Euclid can use this number to force you to consider
The argument then shows that is either prime or is divisible by a prime , where must be larger than .
An even simpler version of this is the “largest number game.” You name a number and then I have to name a larger one. Again if you think that there is a largest one, you must be prepared to say that the largest one is , say. Then I point out that is larger. Scott Aaronson retold here an old story based on this game:
Two noblemen vie to name the bigger number. The first, after ruminating for hours, triumphantly announces “Eighty-three!” The second, mightily impressed, replies “You win.”
Back To The Proof
If you believe the reals are countable, then you must show your hand. You must fix a listing of the reals:
This is nothing different from picking a pair to show that is rational, that there is a largest prime , or even that there is a largest number . This is the show your hand step. If you disagree with this step in Cantor’s proof, then you must also do the same in the above simple examples. You also must not play Poker by the standard rules.
Once I have seen your hand, the sequence , I can try and find a real that you missed. There are many ways that I can do this. Since I can always construct a real number based on your sequence, your hand, that is not any , I win. Your sequence does not contain all the reals.
Leading Out Versus Check-Raising
Ken on the other hand recalls he did not play a single hand of poker during 5-1/2 years at Oxford, and has picked up poker recently only because of the interest of his son. But he finds that his own angle on Cantor’s theorem, informed by his exposure to finitists at Oxford, fits in well with the poker analogy.
In poker a player whose turn to bet comes first is “under the gun” and often at a disadvantage. Assertively “leading out” with a sizable bet gives opponents more leverage to raise since more of the player’s own money is in the pot. Instead the player is allowed to check, i.e. to bet zero, and subsequent players may check until a bet is made. If an opponent bets, then the first player is in the situation of being able to raise, not just call. A raise then is called a check-raise.
The math analogy is that asserting the existence of a set that is in some sense more complex than any set previously admitted is a form of leading out. If we’ve only reached a comfort zone with countable sets, then an uncountable set represents a new level of complexity. Indeed, to a finitist, every power-set is a big step up in complexity. If you think a googolplex is a meaningless quantity, then you can only iterate for a few steps before you would pass it.
Leading out with an object on a new level of complexity takes chutzpah. But even a humble finitist may be comfortable with theorems of the form, “If you have an object then I can give you a .” This is like saying “I check”—let someone else bet . If has the same intuitive complexity level as then that is like a call. If is a step up in complexity from , but similar to how was a step up from whatever was “in the pot” already, then is a check-raise.
The “Humble” Cantor Theorem
Here is a form of Cantor’s Theorem that embodies the check-call or check-raise idea, depending on how it is applied:
Theorem 1 Given any mapping from a set into its power set , we can find a subset of that is not in the range of .
Proof: Define . We claim that is not in the range of . If it were, then there would exist a such that . But then we have:
is in the set
by definition of
is not in the set
by definition of .
Since a logical statement can never be equivalent to its negation, this contradicts being in the range of .
Note that the proof itself embodies Cantor’s diagonal argument, but makes no reference to infinite constructions or possibilities, indeed no reference to infinity at all. It is not asserting anything on its own, but reacting to someone else giving an . Since has roughly the same complexity as —both involve the power set—producing is like a call.
What this has done is bring the implied onus of the bettor to show one’s hand into the theorem statement itself. It puts the question to the Count right away. For simplicity, let’s first talk about the power set of being uncountable:
If you think that is countable, then you are giving me a function from to that is onto . You can’t give me such a function, because I can give you and it will show that you were bluffing.
With a little technical work a Count should be convinced that a bluff on is equivalent to a bluff on . Thus the Count must admit that one can never have the cards to win a showdown that the reals are countable, and by “never” itself, should come on board that they are uncountable.
Raising Higher
One nice point about this formulation is that it carries over verbatim to prove the existence of non-r.e. sets. Just take to be the mapping from a Gödel number to the language accepted by the Turing machine , and what tumbles out is the creation of a diagonal language . Since is more complex than any in the sense of being non-r.e., we can call that a check-raise.
How widely can one replace theorems of the form “There exists a ” with “humbler” ones of the form “If you give me an , then I[…]
october 2011 by josephzizys
Copy this bookmark: