josephzizys + algorithms   9

The Meaning of Omega
And the meaning of the recently-broken metric TSP approximation ratio

Andrew Stothers wrote in 2007 a first-year qualifying report for the University of Edinburgh PhD programme. In only eleven pages he presented the parts of the famous paper by Don Coppersmith and Shmuel Winograd (CW) that do enough to obtain for the exponent of matrix multiplication. In a few more pages he surveyed recent work on sufficient conditions for . Page 15, the last except for references, laid out “Possible Directions for Future Work” in five sentences, the fifth being:

Finally, it will also be possible to investigate bilinear algorithms further and use more complex means to determine whether or not “better” algorithms exist for matrix multiplication.

Today we wish to explain what means, why lower is always “better,” why we feel our field is better for the great efforts expended by Stothers and Virginia Vassilevska Williams, and why it is better when intellectual values guide our judgment. We also wish to defend ourselves and others for calling the results “breakthroughs,” comparing and contrasting to recent advances on algorithms for the Traveling Salesman Problem.

I (Ken) wrote a similar first-year qualifier in summer 1982 after my first year at Oxford—they are a staple of leading British universities. Like Andrew’s it was not obligated to have original research, just to demonstrate understanding and capability. One item found its way into a conference paper, but my 1986 dissertation ranged into various other topics.

What distinguishes Andrew’s Nov. 2010 dissertation is its singular focus on accomplishing the objective of his proposal. And doing it. The resulting paper with his advisor Sandy Davie has recently been submitted. Meanwhile in North America, Virginia had been thinking about the same problem since 2005. It may not be true as asserted here that “hundreds of people tried to improve” it after 1987, but we’ll guess about as many people thought about it. It’s enough to note that many researchers felt it worthy of pursuit, including Andrew’s advisor, and that progress took large concerted effort. The nature of this effort—including the use of computers—may be an important harbinger for science by itself.

What Does ω Mean?

The definition of is the infimum of all such that matrix multiplication has an algorithm that runs in time . This is in a classical model of computation where addition and multiplication of scalars have unit cost. Note that there might not be an algorithm that runs in time itself— may properly be a limit even if it turns out to have the least possible value 2. Currently all we know is that from Virginia’s work.

As many have noted, does not have immediate practical relevance because there is evidently a tradeoff with the constant hiding under the in . The tradeoff is so far steep enough that apparently only Volker Strassen’s relatively simple algorithm achieving an exponent of has wide use. So why do we care about values of less than that?

One of the central questions in physics is, why do the fundamental constants of nature have the values they do? The desire for why, in the face of evidence that the values could be arbitrary, has aroused such passion that string theorist David Gross channeled Winston Churchill in exhorting young physicists to “never, never, never give up.”

Now in mathematics it may seem nonsensical to ask, why do numbers like and have the values they do? But with , we are in a sense closer to physics: understanding it is akin to discovering a natural law, at least a law of information. And quantum mechanics has if anything magnified William Hamilton’s vision approaching 200 years ago that Nature computes with matrices, yes large ones. It is possible that detracts from a better statement of important laws, but knowing better brings us closer to them all the same.

Barriers and Breakthroughs

Hence also the interest in whether takes a value with known meaning, such as or (both falsified) or , or as some have mused, or , or as some have argued more likely for exponents, a logarithm of some higher and simpler number. Strassen himself has often stated his belief that is strictly greater than . If it has a value not previously seen in mathematics, then we could hope to discover new mathematical regularities as well; if it has a known value, then this may yield some more explanation about algorithms.

The value seemed a natural possible barrier, but the time interval from to in 1981–82 was very short. By contrast the “CW” bound of was not a natural-seeming number, but it withstood attempts to scale it for 23 years, including a year’s work from each of Virginia and Andrew. Reasons of intellectual judgment, expanded below, we feel are enough to justify calling their final results a “breakthrough.” But we offer a recent concrete case for comparison first.

The Salesman Always Rings 1.5

The Traveling Salesman Problem (TSP) seeks the shortest route to visit each of a given set of sites, coming back in a ring to the starting site. When the sites are in real space and “shortest” refers to Euclidean distance, finding the length of the absolute shortest route remains -hard even in the plane, as discussed here. However, Nicos Christofides in 1976 found a polynomial-time algorithm that finds a tour guaranteed to have total length at most . The method and proof apply to various other cases where the distances between sites obey inequalities that define some kind of metric, with the same bound.

This stood as a barrier for most of these cases, until this year. Shayan Oveis Gharan, Amit Saberi, and Mohit Singh (OSS) broke it for a wide subclass of these problems. Well they “broke” it by achieving a guaranteed ratio to the optimum of

Actually their original paper did not give a value—it merely proved the existence of an giving . The estimate for the magnitude of their breakthrough was supplied later.

Improvements and Predictions

It must be said that the OSS paper and its predecessor introduced techniques that were more novel to the problem than is evident for the new matrix-multiplication papers, and the predecessor won the SODA 2010 “Best Paper” award. However, a referee could have wondered, why bother moving heaven and earth in a restricted case to demonstrate an increment a thousand times smaller than “nano”? Indeed the version of OSS linked above still ends with “” on page 65.

What happened is that the change attracted interest on several continents, and the people in Europe who made a bankable improvement are not even in the euro zone. First Tobias Mömke and Ola Svensson of KTH in Stockholm obtained by using matchings in place of the more-complicated ingredients of OSS, also in time for FOCS 2011. Then Marcin Mucha of the University of Warsaw improved their analysis to obtain

It is noted by all that an even older paper by Michael Held and Richard Karp from 1970 had set a believable target for provable improvements of Christofides’ bound, namely a conjectured ratio of from linear programming relaxations. However, also has dueling conjectures, some supporting a believable target of . The real point of comparison with TSP is that suddenly there was progress on a barrier that had stood for much of the age of the field, and this attracted others to try for more.

We note that Markus Bläser has contended in a comment that the extension of CW used by both new papers has limitations, and we infer that some other experts concur. However, the paper by Vassilevska Williams in particular has two new ingredients: a framework for managing any tensor power as a base algorithm, and a computer implementation of the framework. She also employed an insight of Stothers to simplify the analysis. Though we can imagine a geometrical insight that higher powers bring returns that “shrink geometrically,” can we really constrain the likelihood that pursuing them might dislodge a different insight that changes the whole game? Is there a limitation theorem here? All we know is that the game has changed, and perhaps the game is afoot.

Worth and Values

The last issue we wish to raise is the reliability of judging the worth of past achievement by present assessment of what it may or may not lead to in the future. We have posted several times on surprises and the difficulty of predicting or guessing how things will go. Of course impetus into the future is necessarily part of claiming something now a “breakthrough.”

However, if we seek a reliable and consistent standard of judging worth, we should use a value that the community has understood for many years: intellectual substance and effort. This value, plus the simple salience of the goal, led our principals to invest the years of work in the first place. Depth of thinking is our gold standard, while applicability is our paper currency. The improvements of and on paper for amid other issues may be a poor return now, but the new computer-assisted vein to mine may parlay true value later.

Dick and I hope this explains why fundamental effort and easily-stated achievement, on a fundamental problem after nearly a quarter-century of stasis, elicited the reaction it has. We agree with, and have tried to extend, Timothy Gowers’ comments here. As for how these values can be invested, only time will be the teller.

Open Problems

Can we be more open about the value of pursuing problems?

What is ?

Suppose we know something special about two matrices and to be multiplied, something not so obvious like their being sparse. Suppose in particular that we have a tiny circuit that for any outputs the value , and similarly for , where those values come from a fixed finite set. Or suppose we know that and/or preserve a similar succinctness property from argument vectors to values—which[…]
All_Posts  Ideas  News  Open_Problems  People  Proofs  Results  Algorithms  approximation  matrix  multiplication  omega  TSP  from google
december 2011 by josephzizys
Another Annoying Open Problem
The k-server problem: some progress, still wide open

Aleksander Mądry gave a stellar talk at our ARC Theory Day a week ago Friday. He is an expert on algorithmic graph theory, among other areas of theory. Already he has multiple best-paper awards, including the paper of this talk from FOCS 2011, and I expect he will get more in the future. This is good going considering that basic LaTeX systems lack a macro for the Polish ogonek diacritic in his name—Ken installed a special package called tipa to get it while editing this post.

Today I would like to talk about his result and the open questions that remain.

He with Nikhil Bansal, Niv Buchbinder, and Seffi Naor (BBMN) have made a recent important contribution to our understanding of the complexity of on-line algorithms. Aleksander’s talk was wonderful and explained what is known, their new result, and what still remains to be done. As he said:

One of the remaining open questions is really annoying—it should be solved—but it continues to resist attacks.

Before I discuss their result I would like to thank the Theory Day organizers Prasad Tetali and Prasad Raghavendra. It was held on the magical day 11/11/11, which does not happen that often, and in fact Mądry’s talk spanned 11:11:11am. Ken wore a red shirt in Buffalo but could not find the WW I remembrance poppy pin he acquired during his sabbatical in Canada. The full scorecard of talks and titles was:

Thomas Dueholm Hansen: Subexponential lower bounds for randomized pivoting rules for the simplex algorithm.

Aleksander Mądry: Online algorithms and the k-server conjecture.

Mohit Singh: A Randomized Rounding Approach for Symmetric TSP.

Ryan Williams: Algorithms for Circuits and Circuits for Algorithms.

The Problem

The problem he spoke about is the now classic on-line server problem. The simplest case is the problem of managing a computer $. Note that this is the way to tell a theory talk on computer caches from an architecture talk on computer caches—in architecture often $ is used to denote a cache. Perhaps this is changing, given the current economic turmoil, perhaps not.

The key is that there are pages in the cache and total pages. Each time a page is requested, if it is not in the cache the task is to decide which page to “evict” from the cache. If the strategy is deterministic, then as Danny Sleator and Bob Tarjan proved back in 1985, the best strategy is only -competitive. This means that the best deterministic strategy could be as bad as times the best off-line strategy which is allowed to see all the page requests at once.

Randomization comes to the rescue. If the strategy for eviction can be random, then it is known that an on-line random strategy exists that is competitive. This is due to Amos Fiat, Dick Karp, Mike Luby, Lyle McGeoch, Danny Sleator, and Neal Young in 1991.

The paging problem is just a special case of the general -server problem. In the general case the simple cache mechanism is replaced by one based on an arbitrary finite metric space. The servers at any step are located somewhere on the points of the metric space. The requests are to “serve” a point in the space: a server must move to that point, and it incurs a cost of the distance in the metric.

In the 1990′s this problem was very “hot,” and there was a stream of results that tried to get good deterministic server strategies. The first results were exponential in but independent of : recall there are servers and the space has points. Finally—in a famous paper—Elias Koutsoupias and Christos Papadimitriou 1994 showed that there is a strategy achieving , which remains today the best known result.

The BBMN Result

The new result is:

Theorem 1 There exists an -competitive on-line strategy for the -server problem.

This is a great improvement, but it also changes the rules. It is great since the competitiveness bound is poly-log in . However, all the previous results we discussed had competitiveness bounds that did not depend on the size of the metric space, . Their result does.

See their paper for the proof. The key idea is that they: Reduce the -server problem over arbitrary metric to a (more difficult) problem over a very simple metric The very simple metric is a type of tree-like metric. The proof relies on the fact that any metric space on points can be well approximated by such a metric space. The cost of the approximation grows roughly as , so it is not surprising that their theorem has factors of in the competitiveness bound.

Open Problems

The -server problem has the following bounds and gaps in those bounds:

If the strategy is deterministic it must take and can be done in . There is a small, but annoying gap of two here. Which is correct?

If the strategy is randomized it must take . There is a huge and annoying gap to the known upper bounds here. There is no known strategy for even the simple case of the real line that is randomized but beats linear in . What is going on?
Finally, are the polylog factors of in their result necessary? In particular can one get rid of the dependence on altogether?

[fixed typo in Fiat et al. to O(log k)]
1  All_Posts  News  Open_Problems  Results  Algorithms  approximation  deterministic  lower_bounds  randomness  from google
november 2011 by josephzizys
More Quantum Chocolate Boxes
Unwrapping the layers between classical and quantum?

Daniel Simon discovered the first quantum algorithm that yielded an oracle separation of from an exponential-time analogue of . Even more, his algorithm was the direct inspiration for Peter Shor’s factoring algorithm, which roughly speaking replaced Simon’s use of the Hadamard transform with the quantum Fourier transform. Apologies to Christopher Marlowe, Shor’s algorithm was the one with “the phase that launched a thousand scrips” (scrips meaning papers, funding, you name it).

Today we wish to revisit the question of solving Simon’s problem by a classical algorithm. We also ask whether the classical algorithms are being given a fair counterpart to the following principle, which the quantum algorithms take for granted:

Given an invertible function, there is a unitary matrix that implements the function as a permutation matrix, and moreover: the matrix is efficiently constructible, in a way that is realizable by a physical system.

Let’s call this the quantum black box assumption (QBB). Without it there is no Deutsch(-Jozsa) or Simon or Grover algorithm, and in the details, no Shor algorithm either (compare discussion here). Without it there are no quantum algorithms for anything interesting, since it is a central assumption required by all them. It is justified by polynomial-time universality theorems for quantum circuits, but is it really innocuous? And since the matrix computes the function at any algebraic input, should not the representation for the classical algorithms do the same?

Ironically, Simon himself is not sanguine on the prospects of quantum computers. He works on Windows security at Microsoft, and was among the many builders of the Transport Layer Security (TLS) protocol. In March 2009 he contributed comments noted here, including this statement:

Basically, there is one thing that quantum computers [are known to do] better than classical computers. Because this one thing happens to lead to polynomial-time algorithms for integer factoring and discrete log, quantum computers have been bandied about as an incredible new computing technology, but the truth is that this one thing is really very limited in scope, and in a decade and a half, nobody’s found another significant application for it.

Moreover, there are lots of (admittedly informal) reasons for believing that quantum computers can’t really do anything interesting beyond this one thing.

Simon would add, following the title of his own blog, “I could be wrong.” We could be wrong likewise, but let’s first look at Simon’s problem.

Simon’s Distinguishing Problem

We describe Simon’s problem with the same chocolate-box metaphor we recently used here. Again we are trying to identify a Boolean function , except this time maps into itself, i.e., the output is an -bit string not just 0 or 1. Again we are told that is a special function that belongs to a limited number of “boxes.” Here we have one box for every “hidden” string :

As usual means bitwise exclusive-or. When is the all-zero vector, the condition holds iff is 1-to-1, and so box includes all that are bijectively onto . The other boxes, however, comprise functions that are exactly 2-to-1 in a particular periodic way.

The input to Simon’s problem is an oracle for an unknown member of one of the boxes. The first goal is to distinguish the case from being in one of the other boxes. The full goal is to tell which box, i.e., to compute with high probability. Simon’s beautiful result is that this can be done by a polynomial time quantum algorithm.

As in our previous post, is given as an operator on qubits, such that on input padded with -many 0′s, outputs the pair . The point of necessity is that this computation is 1-to-1 even when itself is 2-to-1.

Some Basics

The key to understanding a quantum algorithm is to look at the vectors and unitary operators. All in this case are from real Hilbert spaces, or if you prefer finite dimensional Euclidean spaces. We will use capital letters for our vectors, and script letters for our matrices. We also avoid subscripts and use the simple notion to denote the coordinate of the vector . If you like we are viewing vectors as functions.

The Hadamard matrix is denoted by where is always . We also use

to denote the row and column of the matrix . Note, the important fact that

where is the boolean inner product of and . If is a vector, then is defined by

Simon’s Quantum Algorithm

The first key insight is that we can use a quantum algorithm to efficiently construct a vector so that

when and is zero otherwise. The is just the concatenation of the numbers and . Essentially we are using two Hilbert spaces that we put together by a tensor product. Note: this is where the QBB assumption is invoked—without this assumption the algorithm cannot even get started—since where is a -qubit state prepared as the equal superposition of basis states in the first qubits, all with zeroes in the second qubits.

Then we can apply the matrix to and yield the vector . The key is that

This is what is meant by applying the Hadamard only to the coordinate.

We need to analyze what the amplitude is of the coordinates . There are two cases.

In this case is one-to-one. Let see what is then. As the sum runs over values the only time is non-zero is when . This only happens once, and therefore

Thus any measurement of will yield simply a random uniform number in the range to . After a number of these samples it will be clear that there is no .

In this case is two-to-one. Let see what is in this case—it is much more interesting. If has no pre-image under , then . If it has a pre-image it also has a pre-image . In this case

This is either or depending on whether or not . This gives an equation on over the boolean finite field. After enough equations one can solve for and solve the problem.

Quantum Versus Classical Settings

All discussions of Simon’s algorithm immediately point out that there is no classical algorithm that can solve the problem without an exponential number of calls to the black box for . This is easy to prove and is unassailable—it is correct. Case closed: quantum is more powerful than classical.

Well of course that is not quite true. If the black box for were presented by a circuit of polynomial size, then the argument that a polynomial amount of classical computing is inadequate would require the assumption that . So the claim that quantum is stronger than classical can be true if the conventional wisdom is true, but who knows?

We suggest that the proper comparison may involve granting the classical setting a multi-linear extension of to use as the black box. This is not restricted to Boolean queries, but can be evaluated on points like that strike us as analogous to Hadamard-transformed states given to . One rub is that if is an unsatisfiable Boolean formula, and comes out to be the all-zero function, recognizing this fact is -hard. If seems too much to assume, compared to the QBB, we are reminded of a common story where one child makes hands like a tank approaching another child, and the second child swoops a hand down like a missile to blow up the tank.

Child 1: “Where did you get that missile?”

Child 2: “Well, where did you get that tank?”

The Classical Black Box Assumption

To make our assumption concrete, we settle on this extension of a Boolean function to a mapping from to :

That is, is the product of factors for and for . For instance, when and ,

The effect is that over Boolean arguments , is zero except when , so that for all Boolean .

Of course for other arguments , can be a sum of exponentially many terms, and it is not evident that can be regarded as easily-given as is. Indeed that is the point. But let’s make the assumption clear:

Given a function with a polynomial size Boolean circuit, there is an arithmetic circuit that is also polynomial in size and computes .

This assertion itself does not force , though obtaining the arithmetic circuit might do so. We will call this assumption the classical black box assumption (CBB). It seems unlikely to be true, but we cannot rule it out immediately. We can also compare with the “Arithmetic Checkability” axiom of this paper by Russell Impagliazzo, Valentine Kabanets, and Antonina Kolokolova.

And for a companion question, is the QBB assumption really true in prospect? Why is it fair to have a quantum linear operator that can be applied to other quantum states besides the 0-1 basis states, and not be able to do so with a classical multi-linear extension? Or is this where the first power of quantum over classical resides?

Using CBB for Simon’s Problem

Can we use the CBB assumption to create a classical algorithm for Simon’s problem? We would like to report that the answer is yes, but we leave it open. It seems likely to be true, yet we cannot prove it. Still we find the question interesting in its own right, and in shedding new light on QBB.

The goal now is to show that evaluating at the right values will yield information about the structure of , just as Simon’s algorithm uses QBB to get information about . Here is just a note on how to start, and some difficulties that we hope you can help us resolve. When , the “schematic terms” all give value . It follows that whenever is 1-to-1,

When is 2-to-1, it is possible that . If it doesn’t, then we instantly deduce . Whether it does depends completely on the range of , not on the value of . When , this happens exactly when the range of is , or when it is .

Our approach now is to ask, what happens on arguments whose entries are or , perhaps chosen at random? For such we have

where means the number of places where and […]
All_Posts  Ideas  Open_Problems  Algorithms  classical  quantum  randomness  from google
november 2011 by josephzizys
Something you should know about: Quantifier Elimination (Part I)
by Arnab Bhattacharyya

 

About a month ago, Ankur Moitra dropped by my office. We started chatting about what each of us was up to. He told me a story about a machine learning problem that he was working on with Sanjeev Arora, Rong Ge, and Ravi Kannan. On its face, it was not even clear that the problem (non-negative rank) was decidable, let alone solvable in polynomial time. But on the other hand, they observed that previous work had already shown the existence of an algorithm using quantifier elimination. Ankur was a little taken aback by the claim, by the power of quantifier elimination. He knew of the theory somewhere in the back of his mind, in the same way that you probably know of Brownian motion or universal algebra (possible future topics in this “Something you should know about” series!), but he’d never had the occasion to really use it till then. On the train ride back home, he realized that quantifier elimination not only showed decidability of the problem but could also be helpful in devising a more efficient algorithm.

Quantifier elimination has a bit of a magical feel to it. After the conversation with Ankur, I spent some time revisiting the area, and this post is a consequence of that. I’ll mainly focus on the theory over the reals. It’s a remarkable result that you definitely should know about!

1. What is Quantifier Elimination?

A zeroth-order logic deals with declarative propositions that evaluate to either true or false. It is defined by a set of symbols, a set of logical operators, some inference rules and some axioms. A first-order logic adds functions, relations and quantifiers to the mix. Some examples of sentences in a first-order logic:

\(\displaystyle \forall x~ \exists y~(y = x^2)\)

\(\displaystyle \forall y~ \exists x~(y = x^2)\)

\(\displaystyle \forall x,y~((x+y)^2 > 4xy \wedge x-y>0)\)

The above are examples of sentences, meaning that they contain no free variables (i.e., variables that are not bounded by the quantifiers), whereas a formula \({\phi(x_1,\dots,x_n)}\) has \({x_1, \dots, x_n}\) as free variables. A quantifier-free formula is one in which no variable in the formula is quantified. Thus, note that a quantifier-free sentence is simply a proposition.

Definition 1 A first-order logic is said to admit quantifier elimination if for any formula \({\phi(x_1,\dots,x_n)}\), there exists a quantifier-free formula \({\psi(x_1,\dots,x_n)}\) which is logically equivalent to \({\phi(x_1,\dots,x_n)}\).

If the quantifier elimination process can be described algorithmically, then decidability of sentences in the logic reduces to decidability of quantifier-free sentences which is often a much easier question. (Note though that algorithmic quantifier elimination of formulas is a stronger condition than decidability of sentences.)

2. Quantifier Elimination over the Reals

The real numbers, being an infinite system, cannot be exactly axiomatized using first-order logic because of the Löwenheim-Skolem Theorem. But the axioms of ordered fields along with the intermediate value theorem yields a natural first-order logic, called the real closed field. The real closed field has the same first-order properties as the reals.

Tarski (1951) showed that the real closed field admits quantifier elimination. As a consequence, one has the following (Seidenberg was responsible for popularizing Tarski’s result):

Theorem 2 (Tarski-Seidenberg) Suppose a formula \({\phi(y_1,\dots,y_m)}\) over the real closed field is of the following form: \begin{equation*} Q_1x_1~Q_2x_2~\cdots Q_nx_n~(\rho(y_1,\dots,y_m,x_1,\dots,x_n)) \end{equation*} where \({Q_i \in \{\exists,\forall\}}\) and \({\rho}\) is a boolean combination of equalities and inequalities of the form: \(\displaystyle f_i(y_1,\dots,y_m,x_1,\dots,x_n) = 0\)

\(\displaystyle g_i(y_1,\dots,y_m,x_1,\dots,x_n) > 0\)

where each \({f_i}\) and \({g_i}\) is a polynomial with coefficients in \({{\mathbb R}}\), mapping \({{\mathbb R}^{m+n}}\) to \({{\mathbb R}}\). Then, one can explicitly construct a logically equivalent formula \({\phi’(y_1, \dots, y_m)}\) of the same form but quantifier-free. Moreover, there is a proof of the equivalence that uses only the axioms of ordered fields and the intermediate value theorem for polynomials.

I should be more explicit about what “explicitly construct” means. Usually, it is assumed that the coefficients of the polynomials \({f_i}\) and \({g_i}\) are integers, so that there is a bound on the complexity of the quantifier elimination algorithm in terms of the size of the largest coefficient. (If the coefficients are integers, all the computations only involve integers.) But, even when the coefficients are real numbers, there is a bound on the “arithmetic complexity” of the algorithm, meaning the number of arithmetic operations \({+, -, \times, \div}\) performed with infinite precision.

A quick example. Suppose we are given an \({r}\)-by-\({s}\) matrix \({M = (M_{i,j})}\), and we’d like to find out whether the rows of \({M}\) are linearly dependent, i.e. the following condition:

\(\displaystyle \exists \lambda_1, \dots, \lambda_r \left( \neg \left(\bigwedge_{i=1}^r (\lambda_i = 0)\right) \wedge \bigwedge_{j=1}^s (\lambda_1 M_{1,j}+\lambda_2 M_{2,j} + \cdots \lambda_m M_{r,j} = 0) \right)\)

Tarski-Seidenberg immediately asserts an algorithm to transform the above formula into one that does not involve the \({\lambda_i}\)’s, only the entries of the matrix. Of course, for this particular example, we have an extremely efficient algorithm (Gaussian elimination) but quantifier elimination gives a much more generic explanation for the decidability of the problem.

2.1. Semialgebraic sets

If \({m=0}\) in Theorem 2, then Tarski-Seidenberg produces a proposition with no variables that can then be evaluated directly, such as in the linear dependency example. This shows that the feasibility of semialgebraic setsis a decidable problem. A semialgebraic set \({S}\) is a finite union of sets of the form:

\(\displaystyle \{x \in {\mathbb R}^n~|~f_i(x) = 0, g_j(x) > 0 \text{ for all }i=1,\dots, \ell_1, j= 1,\dots, \ell_2\}\)

where \({f_1,\dots, f_{\ell_1}, g_1,\dots, g_{\ell_2}: {\mathbb R}^n \rightarrow {\mathbb R}}\) are \({n}\)-variate polynomials over the reals. The feasibility problem for a semialgebraic set \({S}\) is deciding if \({S}\) is empty or not. An algorithm to decide feasibility directly follows from applying quantifier elimination.

Note that this is in stark contrast to the first-order theory over the integers for which Godel’s Incompleteness Theorem shows undecidability. Also, over the rationals, Robinson proved undecidability (in her Ph.D. thesis advised by Tarski) by showing how to express any first-order sentence over the integers as a first-order sentence over the rationals. This latter step now has several different proofs. Poonen has written a nice survey of (un)decidability of first-order theories over various domains.

2.2. Efficiency

What about the complexity of quantifier elimination over the reals? The algorithm proposed by Tarski does not even show complexity that is elementary recursive! The situation was much improved by Collins (1975) who gave a doubly-exponential algorithm using a technique called “cylindrical algebraic decomposition”. More precisely, the running time of the algorithm is \({poly(C,(\ell d)^{2^{O(n)}})}\), where \({C}\) measures the size of the largest coefficient in the polynomials, \({\ell}\) is the number of polynomials, \({d}\) is the maximum degree of the polynomials, and \({n}\) is the total number of variables.

A more detailed understanding of the complexity of quantifier elimination emerged from an important work of Ben-Or, Kozen and Reif (1986). Their main contribution was an ingenious polynomial time algorithm to test the consistency of univariate polynomial constraints: given a set of univariate polynomials \({{f_i}}\) and a system of constraints of the form \({f_i(x) \leq 0}\), \({f_i(x) = 0}\) or \({f_i(x) > 0}\), does the system have a solution \({x}\)? Such an algorithm exists despite the fact that it’s not known how to efficiently find an \({x}\) that makes the signs of the polynomials attain a given configuration. Ben-Or, Kozen and Reif also claimed an extension of their method to multivariate polynomials, but this analysis was later found to be flawed.

Nevertheless, subsequent works have found clever ways to reduce to the univariate case. These more recent papers have shown that one can make the complexity of the quantification elimination algorithm be only singly exponential if the number of quantifier alternations (number of switches between \({\exists}\) and \({\forall}\)) is bounded or if the model of computation is parallel. See Basu (1999) for the current record and for references to prior work.

As for lower bounds, Davenport and Heintz (1988) showed that doubly-exponential time is required for quantifier elimination, by explicitly constructing formulas for which the length of the quantifier-free expression blows up doubly-exponentially. Brown and Davenport (2007) showed that the doubly exponential dependence is necessary even when all the polynomials in the first order formula are linear and there is only one free variable. I do not know if a doubly exponential lower bound is known for the decision problem when there are no free variables.

Thanks to Ankur for helpful suggestions. Part II of this post will contain a relatively quick proof of Tarski’s theorem! Stay tuned.

 

Alfred Tarski (1951). A Decision Method for Elementary Algebra and Geometry Rand Corporation
Technical_Introduction  algorithms  logic  real_numbers  from google
november 2011 by josephzizys
Who’s Afraid Of Arrow’s Paradox?
How to decide, when you really have to decide.

Kenneth Arrow was a co-recipient of the 1972 Nobel Memorial Prize in Economics—note this is a “Nobel” prize, but is a bit different from the original ones. It still is immensely prestigious, but it is a more recent creation. Arrow is famous for his impossibility result, which shows that there is no “good” voting system.

Today I want to talk about what should we do if we have to vote?

I often find myself on committees whose sole purpose is to make a multiple-choice decision, meaning to select choices from a larger set of choices. This happens for award committees, for membership committees, for program committees, and for various other committees. The key is that the committee is charged with selecting more than one choice from a large number of choices.

I also often find myself on committees whose sole purpose is unclear—perhaps there is no purpose. While I was at Princeton an administrator called a meeting that Hector-Garcia Molina and I had to attend. We ran into the administrator, a few days before his meeting, and after saying hello Hector asked, “what is the agenda of the upcoming meeting?” The answer was “two hours.” Unfortunately this was a short clear answer that was honest and correct.

The Voting Dilemma

Back to committees that do make decisions. The famous Arrow Theorem, see below if you are not familiar with it, shows that there is no voting system that really works. All systems can lead to “paradoxical” outcomes. But decisions need to be made: awards have to be given, members selected, papers accepted, and so on. The puzzle is:

What should we do in practice?

My experience on committees is very mixed. Some seem to operate in a rational manner, while others use strange ad-hoc techniques to make their decisions. The usual approach is to rank the choices—let’s call them candidates—and then select the top ones and drop the bottom ones. This initial selection can often be done with fairly good agreement—of course the ones on the border of that can still be hard. Then the committee has the difficult job of selecting from the remaining candidates. They are usually very close, and it can be extremely hard to make the final choices.

This final step of deciding on the “middle” candidates is where the strange techniques appear. My experience is that the chair usually has a pet or in-house tie-breaking scheme. It may be a voting method based on points, for instance your first choice gets 1 point, the second 2 points, and so on, with the lowest ones being selected. Sometimes voting methods are created on the fly—a member of the committee will say, “why don’t we ”

I know that there is no “best” method, and all methods can be “gamed,” but I still wonder what we should do? Are there in some sense methods that are better than others?

Arrow’s Theorem

Here are three “fairness” criteria that would seem to be axiomatic in any voting system:

Unanimity: If every member prefers candidate over candidate , then the committee prefers over .
Non-dictatorship: No single member possesses the power to determine the commitee’s preference in all cases.
Focus: If every member’s preference between and remains unchanged, then the committee’s preference between and will also remain unchanged, regardless of any changes in preference between pairs or or with .

The first two seem especially basic. If there were a dictator, then perhaps we could just skip the meeting. Perhaps no, since the meeting still has an agenda, even if the agenda is just “two hours.” The third is more commonly called the “Independence of Irrelevant Alternatives” (IIA) axiom. It sounds reasonable because surely the group preference for versus should depend only on the merits of what each individual thinks between and , no? No:

Theorem: When there are at least two committee members and at least three candidates, there does not exist a system of voting that meets the above three axioms.

Formally, let be the set of permutations of candidates, interpreted as rankings, and let be the number of committee members. Then a voting system is a map from to . Let be the subset of in which is ranked higher than and its complement. For any , define

where if and if . Then Arrow’s Theorem states that if always maps into and does not depend on any single argument, then there must exist such that the range of restricted to is not wholly contained in either or .

There is a related theorem by Allan Gibbard and Mark Satterthwaite that extends to systems that work by members giving points rather than preference rankings. Even if the goal is to select just a single winner, the theorem says that every system that always selects when every committee member gives highest points, and that isn’t dictated by a single committee member, is susceptible to being “gamed” by a member seeking an outcome different from the ranking given by the member’s points.

The Aggravation of Aggregation

The root of all the difficulty is well known. To avoid tiebreak issues we regard not as the minimal number of committee members, and consider candidates . The problem is what to do if the members’ permutations are , , and . Then a majority prefer , a majority prefer , and a majority prefer . Who should win?

Related issues involve parliamentary systems that allocate seats to parties based on proportions of the total vote. If you follow the quota rule of giving each party either the integer floor or the integer ceiling of its share of the vote times the total number of seats, then non-monotone situations can arise where a few extra votes or an increase in can cause a party to lose a seat. These situations also affect the number of Congressmen given to states after each U.S. Census.

In committee situations, of course, one usually weeds out the lowest choices altogether. Then the problem becomes, how to re-evaluate the rankings or points given by the members who preferred them? This leads to the issue of runoff methods for elections with more than two candidates, which are also fraught with paradoxes.

For one example, in the women’s figure skating final group at the 2002 Winter Olympics in Salt Lake City, Sarah Hughes stormed from 4th place to win ahead of co-favorite Michelle Kwan, who was pushed to 3rd behind Irina Slutskaya. This article explains that had Slutskaya suffered one more fall, Kwan would have won the gold, despite the feeling based on IIA that how badly a third skater performed should not disturb the focus between Kwan and Hughes. (This had nothing to do with the ice-dancing judging scandal from the same Olympics, but did play into the creation of the new ISU Points System in 2004.)

A term more general than voting for the problem is aggregation of multiple preferences into one rank order. Every time one searches on Google, one sees the rank result of an aggregate score based on a “commmittee” of page-rating criteria. Google has evidently chosen an aggregation method that suits its own criteria, at least most of the time. That leads to our query: can we identify situations for committees in which there does exist a procedure that is (nearly) always satisfactory?

Voting Methods

There are a vast array of voting methods that committees can use. All can have paradoxes, but all have some properties that make them interesting. Many are used in real situations, but the fact that the list of methods is so long suggest that this is far from a solved problem. Here are some:

Highest averages
D’Hondt method
Sainte-Laguë method
Largest remainder
Hare quota
Droop quota
Wright system
Parallel voting

This problem has been studied through the centuries, and even Lewis Carroll contributed theory to it. What we ask is whether there has been a good mapping of methods to categories of applications where they are most appropriate. We note for instance this paper by Cynthia Dwork, Ravi Kumar, Moni Naor, and Ken’s former student D. Sivakumar while he was at Google. This proposes using Markov Chains to combine the initially-given , followed by a transformation they call “local Kemenization,” and gives an application for identifying spam.

Open Problems

How should we decide when we are on a committee and are faced with a real decision?
History  People  Proofs  Algorithms  paradox  Problems  voting  from google
november 2011 by josephzizys
An Annoying Open Problem
How do we tell if two groups are isomorphic?

Vikraman Arvind and Jacobo Torán are both complexity theorists, both with wide interests, but both who have worked extensively on the graph isomorphism problem. See their survey here, for example.

Today I want to talk about about an open problem, and their paper on the problem, which makes some progress. But the problem is still open—I find this an embarrassment—we should be able to solve this problem.

The problem is: Given two groups defined by tables, are they isomorphic? This is the - problem. What is annoyingly open is whether this problem belongs to , or alternatively, whether it is equivalent to Graph Isomorphism.

A Cayley table, after the 19th century British mathematician Arthur Cayley, is just the table of the multiplication function

There are many ways to represent a finite group. The above is natural: just give the multiplication table. There are many others: we could give the group as a “black box,” as a set of generators and relationships, or as a set of matrices. It should be clear that the easiest representation to deal with must be Cayley tables. Many questions that are hard in other representations are easy, even simple, in this representation. Note that this representation is larger than the size of the group itself, basically its square. The other representations, by contrast, are more succinct.

The Problem

The precise definition of the - problem is: given two tables that define the groups and , are they isomorphic? That is, is there a bijective mapping from the elements of to the elements of , say , so that for all in :

This is a fancy way of saying that except for the names of the elements the groups are the same.

The Short History

I find this problem very annoying. I have spent uncountable hours working on this over the last decades, especially jointly with Zeke Zalcstein. One reason the problem is so annoying is that it is easy to prove that it can be reduced to Graph Isomorphism (). But surely the additional structure of groups must help make the problem much easier to solve. Here are some encouraging differences between groups and graphs:

Every subset of elements of a graph is a graph; only certain subsets of elements of a group form a group.
Groups have a tight structure that varies with the prime factorization of , the number of elements. For instance, if is a prime there is exactly one group—the cyclic group .
Groups always have automorphisms, provided . Graphs can have many automorphisms—for instance the complete graph has —or they can have as little as no nontrivial automorphisms.
I could go on and on with the many structural differences.

Yet for all these differences, the best Zeke and I could do was prove, in 1976, that - can be solved in space . Bob Tarjan independently noticed at the same time that it could be done in time. Both results are really the same: they both exploit the fact that a group of elements always has a generating set of size . This is a direct consequence of Lagrange’s Theorem, as follows:

Let be a group with elements. Let the trivial group that contains only the identity element of . If is all of , then stop and we have found generators. If is not add any element of that is not in . By Lagrange’s Theorem this must have at least twice as many elements as : call it . The last point is key: let have elements and have elements. Clearly . But , which implies that .

Complexity Approach

One of the success stories of complexity theory, and potentially a great story, is its application to solve the - question—almost. The well-known Merlin-Arthur protocol for graph isomorphism works also for group isomorphism—see here and here for details.

What Arvind and Torán prove is that at least for a large class of groups, namely the solvable groups, the - problem is almost in . Namely, there is an -machine for the complement of this language that works correctly on all but a quasi-polynomial number of inputs of length . This may not sound very impressive, but it is. By the famous Feit-Thompson theorem, all groups of odd order are solvable. Note that solvable groups play an important role in others ways in complexity theory, which I have discussed a number of times before.

What they actually first show is that the complementary task, --, has an Arthur-Merlin protocol in which Arthur uses only random bits and Merlin uses only nondeterminism. This makes various kinds of de-randomization easier to apply. The existence of a language in that is bi-immune to polynomial space, i.e. the assumption , suffices to de-randomize this completely and thus put - into .

The Group Approach

Recall that groups come in various types: there are simpsle groups, there are solvable groups, there are abelian groups, and more. One type of groups that are very important are called -groups. A group is a -group if the order of the group—the number of elements in the group—is where is a prime and . Every such group is solvable.

One of the outstanding issues in group theory is that there is no general structure theorem for -groups. They are very complex, and even understanding the structure of all such groups of size for modest values of is non-trivial.

The groups of order for were classified early in the history of group theory, and modern work has extended these classifications to groups whose order divides . According to our friends at Wikepedia–see here:

the sheer number of families of such groups grows so quickly that further classifications along these lines are judged difficult for the human mind to comprehend.

The number of different, non-isomorphic, -groups is exponential:

As a complexity theorist I would disagree a bit with the idea that many implies hard to understand. There are many boolean strings of length , but they are not that hard to understand in some sense. What makes -groups hard, is that they can have complex structure, and we do not understand it, at all.

Zeke and I worked hard trying to understand these groups enough so that we could get a better isomorphism algorithm. We never made any progress. I would be excited if we could even prove there is an algorithm that runs in

for -groups.

Open Problems

Please solve the - problem. Or at least break below the time, which is the best known now for decades. Can you prove

for some ? Good hunting.
P=NP  People  Proofs  Algorithms  AM  coNP  groups  isomorphism  NP  Problems  solvable  from google
october 2011 by josephzizys
Ask Bob
A new game that helps us understand the structure of .

Dieter van Melkebeek is a theorist at the University of Wisconsin at Madison. He is a past winner of the prestigious 1999 ACM dissertation award—I believe no one has ever won this award more than once. He has proved many wonderful theorems and is one of the world experts in complexity theory.

Today I want to talk about a beautiful new game that he has created for Alice and Bob. The game is based on , one of my favorite problems, as you may know.

The Bob in this game harks back to a vintage page of the ’90s. Vintage ’90s used to mean 1890′s, but now there is a museum of old webpages from the 1990′s. You can ask the Mystical Smoking Head of Bob a question, any question. He is all-powerful, but Alice wants to minimize her interaction with him.

I have had the pleasure to work on one project jointly with Dieter, and while that result was eclipsed by a result of Ryan Williams, I still consider our result a stepping stone for Ryan. Dieter just visited Georgia Tech and gave a talk in our theory seminar funded by the ARC center. It was a beautiful talk, about a great result, which I will now proceed to explain.

The Game for SAT

The game is played by two players: Alice and Bob, who else? They communicate back and forth, but only Alice sees the input, which is a set of -clauses: for example,

Some notation: will always stand for the number of variables and for the number of clauses.

I sometimes wonder whether theory research would stop if the letter “” were banned from our papers? In number theory a complex number is usually denoted by

and I believe that this notation is due to the eminent Bernhard Riemann himself. In his famous paper that introduced the zeta function, he wrote for a complex number on his famous line . Ken finds this notation macaronic for mixing Latin and Greek, when might seem more logical, but some have joked that Riemann used this notation because he forgot what Greek letter comes after .

The title of Riemann’s paper is: “Uber die Anzahl der Primzahlen unter einer gegebenen Grösse, or in English: On the Number of Primes Less Than a Given Magnitude. See here for a translation of the beautiful paper that changed number theory forever.

Back To The Game

Again the game is based on giving Alice a set of -clauses:

Recall is the number of clauses and these clauses are over the variables .

Her task is to determine whether or not they are satisfiable, which means that there is some assignment to the variables of boolean values , so that each clause evaluates to .

As is often the case one player, in this case Alice, must run in polynomial time. Since there is no known algorithm that always runs in polynomial time for , Alice has a tough task. So far the game seems pretty unfair, and pretty boring—its just the satisfaction problem. Nothing new.

But wait: she has the ability to ask Bob for help. Bob, unlike Alice in this game, is all-powerful: Bob can compute or determine anything that he wishes. Alice is allowed to ask questions by sending messages to Bob, who in turn sends back the correct answer—there is no issue of Bob being untruthful or cheating. Bob, in this game, is always going to answer the questions sent by Alice correctly.

That is the new game introduced by Holger Dell and Dieter in their paper. Pretty neat, no? The game so far is still uninteresting, we need to add one ingredient that makes it exciting:

Alice is charged by the number of bits she sends to Bob, and her task is to send as few bits as possible.

Now the game is interesting, very interesting. Without this restriction Alice could simply send all the clauses

to Bob, who would decide whether or not they were satisfiable, and then return one bit to Alice. With this restriction the game is much harder for Alice, and it is completely unclear what she should do? How many bits does she need in worst case to send to Bob? Note: Alice is only concerned with the length of the messages she sends to Bob, there is no restriction on the length of messages from Bob. Of course, Bob cannot send more than a polynomial number of total bits to Alice, since otherwise she would not run in polynomial time—just reading the messages would take too long.

Strategies For Alice

What does Alice do? In the initial version of the game—the free version—Alice is not allowed randomness. In the advanced version she is allowed to use randomness, but for now let’s restrict our attention to strategies that are deterministic.

Alice could just send Bob all the clauses, but encoded cleverly. There are a total of

possible clauses. So Alice could send Bob a bit vector of this length with a in the position if she has that clause. This is better than the naive encoding, but in worst case can be order . Can Alice do better?

Alice could start to send clauses to Bob, one at a time. Each time she asks him whether or not the clauses he has seen so far have more than a polynomial number of satisfying assignments. If they ever have only a polynomial number, then Bob will send all of them to Alice. She then can check the rest of the clauses. A reasonable strategy: Alice is being conservative and not sending all the clauses at once. But it is not hard to create examples where this will require her to send all the clauses to Bob. Thus it will send over bits to Bob, actually order , so it is worse than the first strategy.

Alice can allow Bob to take the lead. For example, Bob could send Alice a polynomial size string with Kolmogorov complexity near the length of . Alice, being polynomial time limited, has no hope of constructing such a string, but Bob is all powerful and can send her such string. The work of Eric Allender, Harry Buhrman, Michal Koucky, Dieter, and Detlef Ronneburger show that having access to such strings can be very powerful. But it turns out that even with this help, Alice cannot determine the state of her clauses in polynomial time.

Alice could Let’s stop here. You can think of your own strategies that are better than the ones that I suggested.

The Theorem

Dell and Dieter prove in their paper the following beautiful theorem:

Theorem: For any , there is no strategy for Alice and Bob that allows Alice to send, in worst case, fewer than bits of messages, unless the polynomial time hierarchy collapses.

Thus, up to an “epsilon” Alice’s best strategy is the simple one: send all the clauses to Bob, let him decide if they are satisfiable and wait for the answer to come back from Bob.

I find this surprising. It seems that there should be a better strategy for Alice, yet the theorem shows that no matter how she twists and turns, she must send about the same number of bits as the naive strategy. A pretty remarkable theorem. I will not attempt to describe the proof for two reasons. The usual: see their paper for the details—they explain it much better than I can. And the real reason—I do not yet understand the proof well enough to outline the main ideas. Perhaps I can do that in the future.

Open Problems

It the bound tight? Can the lower bound be improved? Or is there a strategy that slightly beats the naive one? Another very pretty open question is what happens when Alice is allowed randomness and the ability to make two-sided errors. That is Alice can be wrong either when she says yes or no: she must of course be right most of the time. Dieter and I discussed this question and it seems quite interesting.

Finally, there are two ways to view the game. One is that it shows that is incompressible, and the other is that the game could be a pathway to a proof that the polynomial hierarchy really does collapse. Of course the conventional wisdom is that the hierarchy is strict, but as I have said many times before: who knows? What Dieter has supplied is an intriguing new approach to show the hierarchy does indeed collapse: find a strategy for Alice that sends only total bits. Good hunting.
P=NP  People  Proofs  Algorithms  Alice  award  Bob  complexity  game  P≠NP  from google
september 2011 by josephzizys
Can Quantum Machines Do It All?
What keeps SAT out of BQP?

Dan Boneh is an expert on cryptography, especially the use of advanced number theory, such as the Weil pairing. He is famous for his seminal work on identity based encryption with Matt Franklin, which uses the Weil pairing.

Today I want to recall an old result Dan and I proved years ago and mix in a new result.

When Peter Shor announced in 1994 his instantly famous result that there is a quantum factoring algorithm that runs in polynomial time, many of us were amazed. We quickly read his paper, tried to understand why it worked, and followed the proofs step-by-step. It was—is still—a beautiful result.

Yet there was something about the result that troubled me. Perhaps no one else was troubled, but I did not really understand why the theorem was true. Yes I could follow the steps, yes I could explain the result to my students, but I did not really understand it.

I have discussed this phenomenon before: proofs are not just a mechanical means to move a statement from the conjecture pile to the proved pile. No, proofs are much more—they need to convey why the result is really true. They need to not seem magical, which Peter’s proof seemed to me. Proofs need to explain what is really happening.

Now I will say that Scott Aaronson’s well-noted explanation does this for the central idea, but back then Dan and I had no Scott to guide us. So we did something else.

In order to understand what was going on, Dan and I decided the best way would be to try and generalize Peter’s theorem. This is a standard method in mathematics: one way to really see what is happening is to generalize a theorem. Doing this can often shed new light on why the original theorem is true, and also can be interesting in its own right.

I could give countless examples of this method, it is used so often in math. Here is an example from number theory that is later explained better—I believe—by a classic result from group theory. Consider Fermat’s Little Theorem:

Theorem:
If is a prime and is an integer not divisible by , then

There are many proofs of this fundamental theorem. See our friends at Wikipedia here for several proofs: They have proofs based on: counting necklaces, using modular arithmetic, using the binomial theorem, and using dynamical systems.

Each proof gives a different insight into why must be prime, why cannot divide , and why the theorem is true.

The proof that I personally feel best explains what is happening is the one based on Lagrange’s Theorem:

Theorem: In any finite group, the order of an element divides the order of the group.

As an aside Joseph Lagrange did not prove his theorem, but only discussed special cases in 1771. It was finally proved by Pietro Abbati and published in 1803. The proof of Fermat’s Little Theorem now follows since when is prime, the set of numbers

form a group under multiplication modulo .

Hidden Linear Functions

Dan and I began to look at what we called the hidden linear problem. A function from has period provided

for all . We say that is m-to-one provided at most integers in the set map to the same value in . Note, if is one-to-one, then it is m-to-one with .

Let be a function from the integers to some arbitrary set . Say that f has hidden linear structure over provided there are integers and some function of period so that

for all integers . Further we say that is m-to-one provided is.

Our conjecture, which eventually became a theorem is:

Theorem:
Suppose that is a function that has a hidden linear structure over which is at most m-to-one. Assume the conditions:

Let , so that and are at most polynomial in .

Let be the smallest prime divisor of , so that .

For such a function , in quantum polynomial time in we can recover the values of modulo from an oracle for .

Note that the added conditions are critical. If the function were constant then there would be no way to reconstruct the ‘s: there must be some limit on how much it collapses. The second condition makes the values of the ‘s make sense modulo .

Look at our paper for details. The above theorem is strong enough to prove both factoring and discrete logarithm are in polynomial quantum time. This already indicated that we had made some progress. In Shor’s paper he uses different but similar arguments to handle factoring and the discrete logarithm. I personally thought that our proof helped me have a greater understanding of Shor’s brilliant work.

By the way, proving this theorem was hard. We used the same type of quantum algorithm that Peter did. But the fact that was not one-to-one created serious difficulties for us. Sometimes relaxing from a constant to `polynomial’ is a walk in the park, but not this time… I recall many times that one of us wanted to give up, but eventually we found the key trick. In the one-to-one case—the case that occurs in Peter’s theorems—there is an exponential sum that must be estimated. This sum is quite easy to handle. However, in the poly-to-one case the sum that arose was much harder to bound. We finally found a trick: when trying to bound a sum, change the sum. More on this another time.

Quantum Detection of Periods

Dan and I could also prove the following theorem:

Theorem:
Suppose that has period , and no smaller period. Also let be m-to-one. Add the conditions:

Let , so that and are at most polynomial in .

Let be the smallest prime divisor of , so that .

For such a function , in quantum polynomial time in we can recover the the period .

In Shor’s paper this theorem is essentially proved with : the so called bijective case. Obviously our result is a modest generalization to the case where the function can fail to be one-to-one, but a value in the range can only have a polynomial number of pre-images. Since then there have been much better improvements to the period solving problem. The current best results can recover the period provided the function has small auto-correlation—see this for details. Indeed for quantum algorithms that only use as an oracle, the current results are essentially best possible.

SAT

There is another reason that there may be no quantum algorithm for solving the general period finding problem. That is, given a small circuit—not an oracle—for the function , find its period. The reason is that the following is a theorem:

Theorem: The following two problems are equivalent via a random reduction:

Finding an assignment to a instance.

Finding the period of a Boolean valued function given by a small circuit.

Note, the lower bounds of quantum complexity theory are mostly based on a black box argument—they assume that only oracle access is available to the function whose period you want. There is no way known to show that they apply when the function is given by a circuit. This follows since if one could guess the period.

This will be an example of “evolving” a theorem from a trivial beginning to more-meaningful forms. At issue first are: how small is `small’ and what is the time on the reduction? Then the main issue will become, how close are the resulting cases of period-finding to ones that quantum algorithms can handle?

First we prove a version where `small’ means polynomial, the reduction is deterministic polynomial time, and the cases of periodicity involved are trivial. Namely, given an instance formula for , let be the circuit that on input an assignment outputs . If is not satisfiable then the function computed by is identically zero and so has period . If is true on the all- assignment then we say yes. Else, is satisfiable, so but for some other , so does not have period . It may not have any period, which is what makes this trivial. But we certainly know has period .

Amplifying Periods

Now we want always to have a period, in the case where . This can be done by a simple padding trick. Let us use some number of variables defining a value , and define the circuit by . Then the function always has a period dividing —the question is whether that period is or something larger.

Thus finding the period of a periodic function given by poly-size circuits is -hard. Of course this is far from bijective—it is just 0-1 valued. If we could make bijective on this period, at least when the formula is satisfiable, then we would put into . How close can we come?

We start by arranging that when , with high probability at least one we run across has a period of some prime order , where finding for this is enough to say . Take the above circuit ; we wish to see if it has an input so that . We can assume that is unique, if it exists by using the famous result of Leslie Valiant and Vijay Vazirani.

A further simple random argument shows that we can also assume that the solution is a prime number. Here’s why: Consider flipping each input bit of independently: since the proportion of primes up to is order-of , this will map the to a prime number with reasonable probability, and we only need that our test succeeds on it once. In summary we can assume that is either unsatisfiable or has some prime as the unique solution.

Now let be the function

where is over all prime divisors of . Then it is easy to see the following: If is unsatisfiable, then is identically and has period . If there is a unique prime so that , then has period . Therefore we have reduced solving to period finding—where we can restrict attention to positive cases where the period has prime length and the function is except for a on the period start.

There is a small point, really a BIG point: the function does not have a poly-size circuit. In order to compute it we must factor into it prime factors and then check each one, , to see it makes . However, factoring, currently, has algorithms that run in ti[…]
P=NP  People  Proofs  Algorithms  new  periods  quantum  SAT  from google
september 2011 by josephzizys
Complexity as Enabler
How to be a good cop on the problem-solving beat

Guy Blelloch is best known for his seminal work on parallel computing; if you need advice on parallelism he is the one to ask. Not surprisingly, he is a strong advocate of parallel programming. See his essay, “Is Parallel Programming Hard?” Spoiler alert: he argues that it is not.

Today we wish to talk about complexity theory as an enabler of problem solving, rather than a disabler when a problem is proved hard for or some other complexity class of difficult problems.

Blelloch’s approach emphasizes how parallel structure emerges from specific problems, rather than focusing on control in parallel machine systems. The algorithms realizing this structure are informed by computational complexity theory. For instance, the “scan programs” in the book of his Ph.D. thesis are influenced by theorems relating vector machine models to the circuit classes of the NC hierarchy.

Blelloch has taught a course at Carnegie Mellon titled “Algorithms in the Real World” for many years. This covers classic algorithms in text compression, string searching, computational biology, high-dimensional geometry, linear versus integer programming, cryptography, and others. Some of these areas are informed by complexity, such as -hardness, reduction to and from factoring, and hardness of approximation.

However, even with immediately practical tasks like computing Nash equilibria of games, which as we noted in the last post is complete for a class called , or pricing financial derivatives as discussed and shown hard here, complexity is playing the “bad cop,” policing attempts to find feasible exact algorithms where ostensibly there aren’t any. Ken’s colleagues Hung Ngo and Atri Rudra are developing a seminar to showcase complexity as the “good cop,” enabling applications. The rest of this post is by Atri Rudra.

An Early Example

The earliest example I, Atri, know is the now famous pattern matching algorithm by Donald Knuth, James Morris, and Vaughan Pratt, which is included in Blelloch’s course. The point is that automata theory was used to design a linear time pattern matching algorithm. The following is quote is from the “Historical Remarks” section of the original paper:

“This was the first time in Knuth’s experience that automata theory had taught him how to solve a real programming problem better than he could solve it before.”

This whole section is worth reading to see how Knuth’s version of the algorithm was implicit in Stephen Cook’s linear time simulation of deterministic two-way pushdown automata (see this reference), which is clearly a complexity result. What could a “deterministic two-way pushdown automaton” have to do with a practical algorithm? Plenty.

Automata theory in itself is a treasure-trove of practical applications. For instance, the ubiquitous regular expressions and parsers in compilers, as they exist today, would not be around without automata theory.

Making Lower Bounds Your Friend

Probably the best known use of complexity as an enabler is to change the rules of the game: use the hardness results to foil the adversary. Cryptography has done this with great effect. There are other similar works, e.g, recently there has been work on designing elections that cannot be manipulated under complexity assumptions: see the survey by Piotr Faliszewski, Edith Hemaspaandra and Lane Hemaspaandra.

For the rest of the post, I’ll go back to the good cop role of complexity. A small bit of warning though: when I talk about practical applications below, I will not be talking about results that can find themselves in the complexity equivalent of the Algorithms in The Field Workshop—rather I will be content to talk about applications where one uses complexity to solve problems that have a clear and direct practical motivation. Given that I have ruled out cryptography, I hope you will allow me this luxury.

Applications of Complexity-Based Tools

Complexity theory has developed many tools that can and should find applications in practical areas where on the surface the two do not seem to have anything in common. Below the surface are beautiful connections that are being exploited can be developed further. Let me try to substantiate this with the following examples:

Unbalanced bipartite expanders. An expander is a sparse graphs in which any subset of vertices that isn’t too big has many edges leaving it, or alternately has many neighboring vertices. These objects in general have many applications both in complexity theory as well as other practical areas—see for instance this great survey by Shlomo Hoory, Nathan Linial, and Avi Wigderson. I picked unbalanced bipartite expanders for my seminar, as they have been developed almost solely by complexity theorists. Unbalanced means that one side of the bipartite graph has many more vertices than the other. These objects have been used to construct compressed sensing matrices, as noted in this survey by Anna Gilbert and Piotr Indyk.

The particular version of the compressed sensing problem is to design a matrix that is long and skinny such that for any real-valued vector ; given one can recover a vector such that

Here is the vector with all but the largest components in have been zeroed out and is the approximation factor. When is the adjacency matrix of a lossless unbalanced bipartite expander, one can obtain arbitrarily close to , and one can efficiently reconstruct form . A general resource page on compressed sensing shows other applications.

Randomness extractors. Pseudo-randomness leads to another great place to look for applications besides expanders, namely extractors, which enable refining the output of partially random sources down to pure uniform bits. Indeed, the best known constructions of extractors, which is due to Venkat Guruswami, Chris Umans, and Salil Vadhan use unbalanced bipartite expanders. Michael Mitzenmacher and Vadhan used these techniques also to demonstrate why limited-wise independent hash functions seem to work as well as what theory predicts for fully independent hash functions in practical applications, owing to the fact that data often has enough inherent partial randomness.

List decoding. Oded Goldreich and Leonid Levin found the first applications of list decoding in complexity theory itself, and this opened the way to much algorithmic progress in list decoding being made by complexity theorists, as surveyed here. List decoding has the potential to correct more errors in practice than traditional decoding, as hailed in this August, 2004 NSF item.

List decoding has also been applied to IP traceback of denial-of-service attacks, to games of guessing secrets, and to verifying truthful storage of client data. The last paper also applies Kolmogorov complexity—disclaimer: this paper is joint work with (my colleague Ram Sridhar’s student) Mohammad “Ifte” Husain, my colleague Steve Ko, and my student Steve Uurtamo.

Locally decodable codes. These are codes where one can compute an arbitrary message symbol with high probability by querying very small number of (random) positions in a corrupted codeword. Complexity theorists can take complete credit for this coding primitive. Along with locally testable codes, these coding objects came out of the work on probabilistically checkable proofs (PCPs). Unlike list decoding, the theoretical limits on locally decodable codes are still not known, though there has been some recent progress starting with this breakthrough work of Sergey Yekhanin. Typically these codes are studied in the regime of a constant fraction of errors, where the amount of redundancy needed in the code has to be super-linear. Recently, locally decodable codes for a single error have been studied by Yekhanin with Parikhshit Gopalan, Cheng Huang, and Huseyin Simitci. This work is tailored for distributed remote storage systems where the number of errors is small and local decoding translates to lower communication between servers needed to recreate the server that is down.

Superconcentrators. Superconcentrators are sparse graphs where certain designated set of input and output nodes are connected by many node-disjoint paths. (One can construct superconcentrators using expanders.) These objects were originally studied by Leslie Valiant and others in the context of complexity lower bounds. This was also inspired by connections to switching networks, namely routing boxes that switch traffic from one fiber to another, and for work showing the connections go both ways, see here and here and here. See also this post on a problem about routing network traffic on expanders.

Models

Complexity theory studies computation through the lens of various machine and program models. I believe that these models themselves can be a valuable export to practice. Probably the most widely exported model is that of communication complexity, which has found applications in diverse areas such as data stream algorithms and game theory. However, these are again used to prove lower bounds, on the bad-cop’s beat. Below are some examples I am aware of that prove positive results:

Natural Computational Models. The idea here is to model natural phenomena using computational ideas, and then to use complexity tools to prove interesting results. One recent example is Valiant’s new research on quantitative aspects of evolution. This and follow-up work bring in tools developed in computational learning theory. Another example is this work by Nikhil Devanur and Lance Fortnow developing a computational model of awareness and decision making, which uses Leonid Levin’s foundational complexity concept of universal enumeration.

Restricted Models. The idea is to take restricted computational models that have already been studied in complexity, and adapt them to model practical applications where upper […]
1  Ideas  Oldies  Algorithms  complexity  from google
september 2011 by josephzizys

Copy this bookmark:



description:


tags: