Complexity Analysis of Mixed Constraint Satisfaction Problems

Xiaomei Shi^{*}

Show more

1. Introduction

Constraint satisfaction problem (CSP) is originated from artificial intelligence. It is a very important branch in the field of artificial intelligence, and it is a problem widely studied in computer science, information science, discrete mathematics and other interdisciplines. At present, a number of problems restricting the development of computer science, automatic control, system engineering and other disciplines can be modeled as CSPs. At the same time, CSPs are widely used in related application fields such as resource allocation, pattern recognition, temporal reasoning and image recognition.

CSP is composed of a set of variables and a set of constraints. Each variable takes a value from the corresponding non-empty domain. The number of elements in the domain can be fixed, or change with the number of variables. Each constraint has a corresponding incompatible assignment set to restrict the values of variables appearing in the constraint. The constraint set randomly selected constitutes a random instance of CSP. Given a random instance, if there exists an assignment to the variables satisfying all the constraints in the instance simultaneously, we call this assignment a solution. Usually, most of the CSPs are NP-complete problems. A lot of theoretical research and experimental results show that for many NP-complete problems we can define a control parameter (such as constraint density, constraint tightness, etc.). There is a critical value of the control parameter, below which the probability of an instance being satisfiable goes to 1, and above which it goes to 0 as the number of variables approaches infinity. People call this phenomenon the phase transition. The critical value is a satisfiability phase transition point.

Random k-SAT (k-satisfiability. Every clause of random k-SAT is randomly and independently generated. Each clause contains just k different variables) problem is a typical CSP with a fixed domain (The variable is Boolean, i.e. 0 and 1 or T and F), and the first NP-complete problem proved by Cook. Research shows that when, 2-SAT is in P class [1] [2] . We can get the satisfiability phase transition point (represents the ratio of the number of constraints and the number of variables). When, it belongs to NP-complete problem [3] . Although the phase transition phenomenon has been established, it is hard to get the accurate phase transition point rigorously. The currently known upper and lower bound of the phase transition region of the 3-SAT problem is 4.4898 [4] and 3.52 [5] , respectively. It is shown by Mertens et al. that the satisfiability phase transformation of 3-SAT problem occurs near 4.2667 [6] experimentally. In order to study the transition behavior between P and NP- complete problem, Monasson et al. [7] proposed the model -SAT mixing with 2-clause and 3-clause. Here, , and in m clauses of the model there are 2-clauses and 3-clauses. Obviously, when, it is a 2-SAT problem, and when, it is a 3-SAT problem. Later, It is rigorously proved by Achlioptas et al. [8] that when the model had transition characteristics similar to those of 2-SAT problem, and we could get the exact satisfiability phase transition point. When, the model has transition characteristics similar to those of 3-SAT problem, and we can’t get the exact phase transition point. This means that phase transition characteristics of model -SAT are determined by the ratio of 2-clauses and 3-clauses.

Model RB (Revised B) is a nontrivial random CSP with growing variable domain. Specifically, in order to overcome the trivial gradual unsolvability of model B in the standard model CSP [9] [10] (that is the binary model CSP), Xu et al. proposed a new CSP model, i.e. model RB [11] . This model is a modification of model B in terms of domain of variables and the number of constraints, which avoids model B’s disadvantage of failure to generate hard instances. In [11] , it has been strictly proved that phase transitions do exist for model RB as the number of variables approaches infinity. Moreover, the phase transition points are also known exactly. Later, Xu et al. theoretically proved that random instances generated by model RB in phase transition region had the exponential tree-like resolution complexity, which meant that these instances were almost all hard [12] . Later, an experiment also proved that near the phase transition threshold the hardness of finding solutions grows exponentially with the problem size [13] . Therefore, hard instances generated based on model RB are widely applied in all kinds of algorithm competitions such as CSP and SAT [14] as benchmarks, in order to test the performance of the algorithm. Since it was proposed, in a relatively short period of time model RB drew the attention of some scholar and was studied by them. Zhao et al. analyzed how the phase transition region of model RB became an exact phase transition point under the finite-size scaling effect, and gave the lower bound of scaling window [15] . Based on model RB, Shen et al. proposed a new model CSP, i.e. model d-k-CSP, and proved that the model also had the exact satisfiability phase transition phenomenon, and could generate a large number of hard instances [16] [17] [18] . In terms of the algorithm, Zhao et al. introduced the cavity method in statistical physics, and successively proposed two kinds of random CSP solved by the message passing algorithm guided by Belief Propagation and Reinforced Belief Propagation algorithm generated by model RB with a large domain [19] [20] [21] . Recently, Yuan et al. proposed RSA (Revised Simulated Annealing Algorithm) and GSA (Genetic-simulated Annealing Algorithm) to solve random instances of model RB [22] .

Inspired by model -SAT, Zhao et al. proposed a random CSP model mixed with constraints with different lengths (The constraint length is defined as the number of variables contained in the constraints), i.e. model RB^{mix}. Interestingly, the phase transition behavior of model -SAT is determined by the proportion of 2-clauses and 3-clauses, while the phase transition phenomenon of model RB^{mix} has nothing to do with the number of constraints with different lengths. It has been proved that model RB^{mix} mixed with constraints with different lengths has the exact satisfiability phase transition phenomenon similar to that of model RB, and the phase transition points can be located exactly. We know that if the new model CSP has the exact satisfiability phase transition phenomenon and can generate a large number of hard instances, the model has important significance for testing of CSP algorithm. For model RB^{mix}, we use the resolution method (its general form is for logic formula) to encode random instances generated by model RB^{mix} into the conjunctive normal form in SAT problem, i.e. CNF formula. By using five lemmas, we prove a famous theorem about the resolution length. Therefore, it is proved that the random instances generated by model RB^{mix} almost have no resolution complexity for which the resolution length is less than the exponential size. This shows that model RB^{mix} is hard for algorithms based on resolution, so model RB^{mix} can generate a large number of hard instances in the phase transition region. It is proved in this paper that almost all instances of model RB^{mix} have no tree-like resolution proofs of less than exponential size. Thus, we not only introduce new families of random CSP instances hard to solve, which can be useful in the experimental evaluation of CSP algorithms, but also propose a CSP model with both many hard instances and exact phase transitions.

2. Model RB^{mix}

The random instance of model RB^{mix} is composed of the variable set and the constraint set, where (is called the constraint density). Each variable takes values from its corresponding domain. Here, (is a constant), which means that the domain of variables grows polynomially with the variable number n. We divide m constraints into groups. Each group contains

constraints. Here,. The constraints in each group are

generated in the following way.

Step 1: -ary constrains are selected randomly and repeatably (). Each -ary constrain is composed of different variables selected randomly from n variable sets. Here, is called the constraint length.

Step 2: For each -ary constrain, assignments are selected randomly from assignments without repetition to form the corresponding incompatible assignment set and limit values of variables. Here, is called constraint tightness.

If random instances generated by model RB^{mix} have a group of assignments of n variables, so that all -ary constrains are satisfied at the same time, then the group of assignments is a solution. If the assignments of variables do not belong to an incompatible assignment set, this constraint is satiable. Otherwise, it is unsatisfiable.

Thus, model RB^{mix} is a promotion of model RB on the constraints. In model RB, all constraints are k-ary constraints, which means that the constraint length is fixed. Model RB^{mix} is composed of constraints with different lengths. Apparently, model RB is a special circumstance of model RB^{mix} when. Therefore, the constraint composition of model RB^{mix} is more generally representative.

Assume that is the probability of random instance satisfaction of model RB^{mix}. Let. Then we’ve proved that the following two theorems hold.

Theorem 1 Assume that. If and, then:

(1)

Theorem 2 Assume that. If and, then:

(2)

From the above two theorems, we can know that under the condition that the domain of variables is not too small (), as long as the constraint tightness is low enough or the constraint density is big enough (), when, with the continuous increase of r or p, The probability of a random instance generated by model RB^{mix} being satisfiable suddenly changes from 1 to 0, which means that model RB^{mix} has the satisfiability phase transition phenomenon, and it can get the accurate phase transition point. Therefore, model RB^{mix} is a mixed random CSP with accurate phase transition and a large domain.

3. Main Results

Assume that is a random instance generated by model RB^{mix}. We can get the following theorem.

Theorem 3 If, then almost has no resolution proof of the length less than. Here,.

As we know, for the unsatisfiable random instance I, there must be the shortest resolution proof which can reason out empty clauses. Its length is the lower bound of time consumed by any algorithm based on the resolution principle. Therefore, from Theorem 3 we can know that the solution algorithm of unsatisfiable instances of model RB^{mix} has the complexity of exponential size. Thus in the phase transition region model RB^{mix} can generate a large number of hard instances.

4. Proof of Theorem 3

This paper uses the resolution method to analyze the complexity of model RB^{mix}. The general form is CNF formula for SAT problem. The so-called CNF formula refers to the conjunction expression () of clause (disjunctive form of positive literal or negative literal). Therefore, we first must encode the random instance of model RB^{mix} according to a certain rules into CNF formula, then the resolution complexity of is the complexity of the corresponding CNF formula.

In model RB^{mix}, assume that the domain of each variable is. Define the new proposition variable (Boolean). If the value of is (), then. The following three types of clauses are required to encode into CNF formula.

(1) Clauses of the domain： Each variable is taken from its domain, i.e.;

(2) A value clause at most: Ensure that each variable takes a value from its domain at most at a time, i.e., and;

(3) Conflicted clause: It is used to remove the assignment in the uncoordinated set, so that the constraint is satisfied. For example, is an element in the uncoordinated assignment set limiting the value of and. Then the conflicted clause is.

Definition 1 (Length of the clause) The number of variables appearing in Clause is called length of, written as.

Definition 2 (Length of CNF) The maximum length of all the clauses in F of

CNF formula F is called the length of F, i.e..

Definition 3 (Resolution) The clause sequence () is called a resolution in which is derived from F. If and only if each belongs to F, or is derived from and ()in accordance with the rule and Þ (among them A and B are clauses, x is propositional variable), t is its length.

Definition 4 (Derived length) The length for is derived from F is the minimum length of, written as, i.e..

Definition 5 (Resolution) The resolution in which an empty clause (expressed with) is derived from F in CNF formula is called the resolution of F, written as, i.e.. In which,. Then is the resolution complexity of formula.

We consider the clause in as a node. If is derived from and, the edge is used to connect and, and. If the resulting graph is a tree, is called the tree-like resolution.

For the unsatisfiable formula F, there must be the shortest resolution proof deriving. Its length is the lower bound of time consumed by any algorithm based on the resolution principle.

Ben-Sasson and Wigderson gave the following theorem used to prove the lower bound of the resolution length of CNF formula [23] .

Theorem 4 Assume that F is a CNF formula with n variables, then we have:

. (3)

We want to use Theorem 4 to prove the important conclusion Theorem 3 in this paper. We need to use the following five lemmas.

Before giving Lemma 1, we first give the following definition.

Definition 6 (Subproblem) Assume that I is an instance of a CSP, then the instance composed of subsets of constraint set in I is called a subproblem of I.

Lemma 1 Assume that I is a random instane generated by model RB^{mix}, then, so that almost every subproblem with the size of in I has () constraints at most. In constraints, the length of constraints is, in which and. Here, the subproblem with the size of refers to variables contained in the subproblem of I.

Proof Assume that A = {I has at least constraints and a subproblem with the size of }. We prove that.

Let, then we have

(4)

For the sufficiently large n, here exists a constant, so that

. (5)

Let, then

. (6)

thus

. (7)

Therefore,.

Before giving Lemma 2, we first give the following definitions.

Definition 7 (i-constraint assignment group) Assume that i constraints contain variable x. If assignments are made for variables except x in i constraints, the such assignment group is called i-constraint assignment group, written as.

Definition 8 (flawed i-constraint assignment) The assignment of given variable x is (is the domain of x). If makes at least one constraint in these i constraints unsatisfiable, variable x’s assignment is called flawed for. Further, if any assignment of variable x is defective for, is called a defective i-constraint assignment group.

Lemma 2 Assume that I is a random instance generated by model RB^{mix}, then for defective almost may not exist in I. In which, in these i constraints the length of constraints is. Here,.

Proof Assume that A = {I has a defective i-constraint assignment group }. In which, , which means that it is proved that.

Next, let B = { is not flawed}, then we have

. (8)

= { is not flawed for, }. It’s worth noting that for model RB^{mix}，, so

(9)

Here,

. (10)

because

, (11)

And for, we have

. (12)

From in Lemma 1,

(13)

Thus,

. (14)

Because the maximum choices of i-constraint assignment group are as follows

And, so when n is sufficiently large, constant. Let

(15)

Thus, we have

(16)

So.

Lemma 3 Assume that I is a random instance generated by model RB^{mix}, then almost any subproblem with the size of of I can be satisfied.

Proof Assume that I has the unsatisfiable subproblem with the size of in I is, and the minimum unsatisfiable subproblem is. According to Lemma 1, contains constraints at most. So at least contains one variable x with the maximum degree of (the degree of variable is defined as the number of constraints containing the variable). After x and constraints containing x are removed, the remaining subproblem can be satisfied (is the smallest unsatisfiable subproblem). Thus, has the assignment group (), so that can be satisfied, but is unsatisfiable. In means that for cannot make satisfiable, so is defective. Therefore, for I has a flawed, which is contradictory with Lemma 2, so the assumption is not established. Thus, Lemma 3 is proved.

Before giving Lemma 4, we need to give the following definition.

Definition 9 (Complexity) Assume that I is a random CSP instance. Encode I into CNF formula F. Assume that F’s clause C is derived from sequence (). We can say that the number of variables contained in the smallest scale of subproblems deriving C is the complexity of the clause C, written as.

Lemma 4 Assume that I is a random instance generated by model RB^{mix}. Encode I into CNF formula F. Then almost any inversion of F has a clause which satisfies.

Proof From Lemma 3 we can know that. We establish a binary tree with the node of clause. In which, is the root node, the child nodes are two clauses with which we can reason out, until the leaf node. Obviously, the clause complexity of leaf node. From root node to leaf node, because the complexity of each clause does not increase, clause makes, and it can be derived that and, two clauses of (child nodes) satisfy and. From, In and a satisfies.

Lemma 5.

Proof Assume that I is a random instance generated by model RB^{mix}. Encode I into CNF formula F. Assume that is the smallest subproblem of, and. From Lemma 1 we can know that contains constraints at most. These constraints contain up to variables. So the maximum number of variables with the degree higher than is. Next, we will prove that these variables are in clause C.

Assume that variable x whose degree is not greater than not is not in clause C. Assume that is the remaing subproblem after x and constraints containing x are removed. Because is the smallest subproblem, clause C cannot be derived from the corresponding CNF formula of. So any assignment of variable x does not meet the corresponding CNF formula of, and is defective. This is contradictory with Lemma 2. So variables whose degree is not greater than are all in clause C.

So. The conclusion is established.

Finally, from Lemma 5 and Theorem 4, we can get Theorem 3. That is unsatisfiable random instances generated by model RB^{mix }have no resolution proof whose complexity is less than the exponential size.

5. Conclusion

In this paper, we analyze the resolution complexity of a mixed constraint satisfaction problem named model RB^{mix}. It is theoretically proved that unsatisfiable random instances generated by model RB^{mix }almost have no complexity whose resolution length is smaller than the exponential size. As a result, based on the resolution algorithm the random instance of model RB^{mix} has complexity of exponential size. The conclusion shows that model RB^{mix} with different length constraints not only has accurate satisfiability phase transition phenomenon, but also can generate a large number of hard instances in the phase transition region. Therefore, model RB^{mix} is a large-domain nontrivial random constraint satisfaction problem with mixed constraints and accurate phase transition. In the later work, we can also verify the hardness of model RB^{mix} in the experiment, and further study the solving algorithm.

References

[1] Verhoeven, Y. (2002) Random 2-SAT and Unsatisfiability. Information Processing Letters, 72, 119-123.

[2] Bollobás, B., Borgs, C., Chayes, J.T., Kim, J.H. and Wilson, D.B. (2001) The Scaling Window of the 2-SAT Transition. Random Structures and Algorithms, 18, 201-256.

https://doi.org/10.1002/rsa.1006

[3] Cook, S. (1971) The Complexity of Theorem-Proving Procedures. Proceedings of the 3rd Annual ACM Symposium on Theory of Computing, Shaker Heights, OH, 3-5 May 1971, 151-158.

https://doi.org/10.1145/800157.805047

[4] Díaz, J., Kirousis, L., Mitsche, D., et al. (2009) On the Satisfiability Threshold of Formulas with Three Literals per Clause. Theoretical Computer Science, 410, 2920- 2934.

[5] Kaporis, A.C., Kirousis, L.M. and Lalas, E.G. (2002) The Probabilistic Analysis of a Greedy Satisfiability Algorithm. In: Mohring, R. and Raman, R., Eds., Algorithms— ESA 2002, Lecture Notes in Computer Science, Vol. 2461,. Springer, Berlin, Heidelberg, 574-586.

https://doi.org/10.1007/3-540-45749-6_51

[6] Mertens, S., Mézard, M. and Zecchina, R. (2006) Threshold Values of Random K-SAT from the Cavity Method. Random Structures & Algorithms, 28, 340-373.

https://doi.org/10.1002/rsa.20090

[7] Monasson, R., Zecchina, R., Kirkpatrick, S., et al. (1996) Phase Transition and Search Cost in the (2 + p)-SAT Problem. Proceedings of the 4th Workshop on Physics and Computation, Boston, 229-232.

[8] Achlioptas, D., Kirousis, L.M., Kranakis, E., et al. (2001) Rigorous Results for Random (2 + p)-SAT. Theoretical Computer Science, 265, 109-129.

[9] Gent, I.P., Macintyre, E., Prosser, P., Smith, B.M. and Walsh, T. (2001) Random Constraint Satisfaction: Flaws and Structure. Constraints, 6, 345-372.

https://doi.org/10.1023/A:1011454308633

[10] Smith, B.M. and Dyer, M.E. (1996) Locating the Phase Transition in Binary Constraint Satisfaction Problems. Artificial Intelligence, 81, 155-181.

[11] Xu, K. and Li, W. (2000) Exact Phase Transitions in Random Constraint Satisfaction Problems. Journal of Artificial Intelligence Research, 12, 93-103.

[12] Xu, K. and Li, W. (2006) Many Hard Examples in Exact Phase Transitions. Theoretical Computer Science, 355, 291-302.

[13] Xu, K., Boussemart, F., Hemery, F., et al. (2000) Random Constraint Satisfaction Problems. Journal of Artificial Intelligence Research, 12, 93-103.

[14] Cai, S., Su, K. and Chen, Q. (2010) EWLS: A New Local Search for Minimum Vertex Cover. In: Fox, M. and Poole, D., Eds., Program Cochairs: Proceedings of the Twenty-fourth AAAI Conference on Artificial Intelligence (AAAI-10), The AAAI Press, Menlo Park, 45-50.

[15] Zhao, C. and Zheng, Z. (2011) Threshold Behaviors of a Random Constraint Satisfaction Problem with Exact Phase Transitions. Information Processing Letters, 111, 985-988.

[16] Shen, J. (2011) Model Structure and Phase Transition Phenomenon of Constraint Satisfaction Problem. Doctoral Dissertation, Central China Normal University, Wuhan.

[17] Fan, Y. and Shen, J. (2011) On the Phase Transitions of Random k-Constraint Satisfaction Problems. Artificial Intelligence, 175, 914-927.

[18] Shen, J. and Ren, Y. (2016) Bounding the Scaling Window of Random Constraint Satisfaction Problems. Journal of Combinatorial Optimization, 31, 786-801.

https://doi.org/10.1007/s10878-014-9789-y

[19] Zhao, C., Zhou, H., Zheng, Z., et al. (2011) A Message-Passing Approach to Random Constraint Satisfaction Problems with Growing Domains. Journal of Statistical Mechanics: Theory & Experiment, 2011, P02019.

https://doi.org/10.1088/1742-5468/2011/02/P02019

[20] Zhao, C. and Zheng, Z. (2012) A Belief Propagation Algorithm of Solving the Constraint Satisfaction Problem Based on the Variable Entropy. Chinese Science: Information Science, 42, 1170-1180.

[21] Zhao, C., Zhang, P., Zheng, Z., et al. (2012) Analytical and Belief Propagation Studies of Random Constraint Satisfaction Problems with Growing Domains. Physical Review E, 85, 016-106.

[22] Yuan, Z. and Zhao, C. (2017) Solve Large-Domain Constraint Satisfaction Problem with Two Kinds of Revised Simulated Annealing Algorithm. Application Research of Computers, 12.

[23] Ben-Sasson, E. and Wigderson, A. (2001) Short Proofs Are Narrow-Resolution Made Simple. Journal of the ACM, 48, 149-169.

https://doi.org/10.1145/375827.375835