1/13/10

Understanding SAT

This is an old hobby, just skimming the content of this. Not sure I like the direction satsolving is in. Basically, satsolving boils down to translating boolean terms to CNF and using heuristics on a DPLL solver to solve a problem. CNF is a bit more unknown than DNF which is more used in industry, but very well known by complexity theorists who will often reduce from 3-SAT.



What is good about this research, is that solvers are very fast, since CNF is very 'computer friendly.' What I find bad about it is that the CNF form of formulas often 'hide' what a solver actually does. I.e., often the behavior of a satsolver is better understood as a 'pebbling game,' not the mathematical term, or a 'partitioning' on the original formula, or even, clause rewriting, like resolution, of the original term.

I have been interested in Nand-encoded terms for a long time, but gave it up after some sociologically induced headache. What is nice about those terms is that all basic propositional boolean terms in most algebras directly rewrite to Nand terms only growing a small constant factor. Moreover, a lot of 'normal' boolean axioms directly translate to a smaller set of axioms, there's even a proof that one axiom suffices on the boolean algebra for Nand; also, Nand encoded terms 'maximize' sharing, since terms which look superficially different given Or's and And's, may turn out to share a lot of structure.

At the heart of satsolving lies the P=NP question, satsolving is NP-complete, and it is commonly believed that P=NP is not true. I am one of the few people who actually believe it may be true, since there are no real reasons to assume otherwise, and, moreover, when it comes to basic questions, like minimal circuit size for boolean circuits, there are fundamental questions which are still unanswered and which should be solved. Designing algorithms feels like trying to reinvent Edmond's Blossom algorithm often, a partitioning scheme for graphs which was assumed to be hard for a long time until Edmond formulated an answer. [1]

What is nice about a satsolving Nand terms is that it is more easy to formulate heuristics directly on terms. Ultimately, the question is whether there exists a heuristic which would solve most, if not all, terms polynomially by exploiting the structure of the term. Since there are polynomial paths which rewrite a Nand term to minimal size, I found it interesting to study those for a while - or study those on particular structures. [2]

When looking at satsolvers based on CNF -guess I skimmed a few, cough, thousand articles in that direction- one finds heuristics which are often just 'computer friendly' array algorithms and reduce to specific strategies on ordinary Boolean circuits, and by obfuscuating the problem CNF satsolving sometimes lead to its own folkflore. Like believing that there are borders which encode phase transitions between easy and hard, like a clause/variable ratio of 4.2something, whereas it can be easily proven that simple and very hard terms can be encoded on that border with exactly that ratio. Even when it would hint at some result on random generated instances, I don't think it produces any fundamental insight.

My believe is that this research, even if there will always be a lot to be done to give better, faster, provers which are of industrial value, is 'stuck' fundamentally - i.e., it will not generate new, possibly fundamental, better provers which might turn out to be faster, or stronger, in the end. And may even prove P=NP, or otherwise.

[1] I suspect the believe is actually a rather pompous sociological construction. Since scientists are smart, and they didn't find an answer, it can't be true.
[2] For example, if PRIME is P, is there a 'loose informal' relation between the AKS algorithm, a propositional term, and a heuristic on it? Or, does/should it imply a polynomial heuristic?

A small joke should be inserted here.