Seating Arrangements: Discussion Date: Fri, 16 Feb 2001 From: Vladimir Lifschitz To: TAG Two months ago, I asked you to use your favorite answer set solver to solve the following problem: You are organizing a New Year's Eve party. There will be N tables in the room, with M chairs around each table. You need to select a table for each of the guests, who are assigned numbers from 1 to MN, so that two conditions are satisfied. First, some guests like each other and want to sit together; accordingly, you are given a set A of two-element subsets of {1,...,MN}, and, for every {i,j} in A, guests i and j should be assigned the same table. Second, some guests dislike each other and want to sit at different tables; accordingly, you are given a set B of two-element subsets of {1,...,MN}, and, for every {i,j} in B, guests i and j should be assigned different tables. Your program should find such a seating arrangement or determine that it is impossible. The solutions that I have received are available now at http://www.cs.utexas.edu/tag/seating_solutions . It would be interesting to compare them and to discuss what we can learn from this little example about the methodology of answer set programming. Most of the programs are written for smodels, describe seating arrangements using a binary relation between a guest and a table, and represent assigning m guests to every table by a rule containing a cardinality constraint in the head, such as m {at(G,T) : guest(G)} m :- table(T). Several solutions specify all other conditions on a seating arrangement by rules with the empty heads. This is done in Programs 7, 12, 13, 15, 19, 20. Program 15 on this list is somewhat different from the others: it weeds out the arrangements in which one of the guests is not seated; the others weed out the arrangements in which one of the guests gets more than one seat. This last condition can be expressed also by a rule with a cardinality constraint in the head, such as 1{at(G,T) : table(T)}1 :- guest(G). This is done in Programs 1, 3, 8, 10, 21 and 22. Programs 11 and 16 use rules with the empty heads and cardinality constraints in the bodies, such as :- m+1 {at(G,T) : guest(G)}, table(T). to eliminate the arrangements in which the number of guests assigned to a table is different from m. Programs 14, 17 and 18 ensure that the number of guests assigned to every table equals m by introducing an enumeration of chairs, separately for each table. These programs don't need m in cardinality constraints. One of them (No. 17) talks about chairs by introducing a second binary predicate similar to at(G,T) -- a relation between a guest and the number of his chair; in the other two programs, at(G,T) is turned into a ternary predicate by adding the chair argument. In Program 4, the predicate at(G,T) is replaced by the equivalence relation between guests, sameTable(G1,G2). Neither chair numbers nor table numbers need to be introduced when such a predicate is available. Now a few words about the programs that use systems other than smodels. Program 2 is written in PAL (Pertinence Action Language) presented at NMR-2000. It treats the problem dynamically, as the process of assigning seats to guests one by one. It uses a feature of the language that is not yet implemented. Program 5 is written for SLDNFA, a system that integrates abductive logic programming and constraint logic programming. That system was demonstrated at NMR-2000 as well. In the syntax of SLDNA, the declaration open_function at_table: person -> table corrsponds to what an smodels programmer would write as 1{at_table(G,T) : table(T)}1 :- person(G). Expressions of the form fol F correspond, roughly, to constraints :- not F . Program 6 is similar to Program 17 in the sense that it describes a seating arrangement by two binary predicates--one relating guests to tables, the other to chairs. But it is written for ccalc, rather than smodels, using the fact that ccalc can generate answer sets for a tight program. There is an interesting twist in this use of ccalc as an answer set solver: here, for the first time, this method is applied to rules with cardinality constraints. Consider the rule 1{at(G,T) : table(T)}1 :- guest(G). It is equivalent to the rule {at(G,T)} :- guest(G), table(T). (1) combined with some rules with the empty heads that express conditions on cardinalities. The { } construct of smodels can be equivalently replaced now with nested occurrences of negation as failure: at(G,T) :- not not at(G,T), guest(G), table(T). (2) Indeed, both (1) and (2) allow us to decide independently about each of the atoms at(G,T) whether or not to include it in the answer set. The program obtained from Program 17 by eliminating the { } construct in this way turns out to be tight in a certain generalized sense (work in progress). And it so happens that the completion mechanism of ccalc correctly computes answer sets for the programs that are tight in this more general sense. Finally, Program 9 is written in the language of SEM -- a system for enumerating finite models of first-order many-sorted theories, described in Proc. IJCAI-95. We know one of the creators of that system, Hantao Zhang, as the author of the propositional solver sato. I look forward to your comments on the problem and solutions. ============= Date: Mon, 19 Feb 2001 From: Michael Gelfond To: Vladimir Lifschitz Thank you very much for providing us with an opportunity for an interesting discussion. You received a truly impressive list of solutions. It will take some time to study this and I am looking forward to the exercise. Let me attempt to start the general discussion with an easy task of commenting on my own solution (13). This is what it looks like: SELECT a candidate solution - m people to seat at each table 1. m{at(P,T) : person(P)}m :- table(T). People who dislike each other should seat at different tables 2. :- table(T), person(P1), person(P2), at(P1,T), at(P2,T), dislike(P1,P2). People who like each other should seat at the same table 3. :- table(T), person(P1), person(P2), at(P1,T), not at(P2,T), like(P1,P2). A person is seated at at most one table. 4. :- table(T1), table(T2), neq(T1,T2), person(P), at(P,T1), at(P,T2). SO FAR IT IS PRACTICALLY IDENTICAL TO SOLUTIONS 7 by Selim and 20 by Richard and very similar to several others, including those by Chitta, Veena, and Semra. THE DIFFERENCE is in the next two rules which seem not to be used by anyone else: 5. like(X,Y) :- like(Y,X), person(X), person(Y). 6. dislike(X,Y) :- dislike(Y,X), person(X), person(Y). Actually, I had one more rule, 7. :- person(X), person(Y), likes(X,Y), dislikes(X,Y). but it somehow disappeared. Obviously, answer sets of program (1)--(4) contain all the solutions of our problem. So why did I add rules (5)--(7)? For me these rules constitute a commonsense knowledge about meaning of the words 'like each other' and 'dislike each other' and hence are 'automatically' included in the program. I guess that this automatic inclusion (I actually started with them) reflects my 'knowledge representation centered' style of programming. Rule 7 is an example of defensiveness in A-Prolog programming. It checks correctness of the 'input' database. The interesting question is how addition of these extra rules will effect performance of the program? I didn't really study this but my hope was that the extra information will be useful for early discovery of wrong choices made by rule 1. For a few examples I actually run this proved to be the case. For instance, below is the SMODELS data for the database from (13) Duration: 15.210 Number of choice points: 92 Number of wrong choices: 0 Number of atoms: 21131 Number of rules: 229172 Number of picked atoms: 67534 Number of forced atoms: 0 Number of truth assignments: 326168 Size of searchspace (removed): 1000 (0) As you can see the number of rules is rather big but overall timing is not that bad. If rules (5), (6) were omitted we would need much more time. (Actually I gave up after waiting for about 4 minutes) So here is an interesting question - can we really prove (or disprove) the computational advantage of having rules (5),(6) experimentally? What methodology is appropriate for this? Any comments? ============= Date: Mon, 19 Feb 2001 From: Tuan Chi Le To: Michael Gelfond My initial thoughts on this is adding commonsense knowledge like adding control knowledge into programs that has been pioneered by Bacchus & Kabanza. The performance is good or bad depending on whether the control knowledge is good or bad. We have shown the control knowledge significantly improved the planning efficiency in our paper submitted to Spring Symposium on Answer set Programming. The other question with control knowledge is that, it's sound but can it bring completeness ??? To the best of my knowledge, Reiter also has a paper in AAAI00 that uses control knowledge, but both Bacchus & Reiter did not show whether the program still has completeness when adding control knowledge. ============= Date: Tue, 20 Feb 2001 From: Michael Gelfond To: Tuan Chi Le Thanks for a prompt reply. Here is a very long reaction to it addressed as usual to everyone. Tuan: My initial thoughts on this is adding commonsense knowledge like adding control knowledge into programs that has been pioneered by Bacchus & Kabanza. Michael: First a clarification: when I was writing my program efficiency was not an issue at all. I just wanted to write something simple, natural, and 'obviously' correct. So my attention was at relations I was defining (like, dislike, and at) and not at SMODELS (or any other) algorithm. I think that this makes my rules for 'like' and 'dislike' different from control knowledge. There are of course very close similarities too and sometimes the boundary is gray. Back to your comment: Tuan: The performance is good or bad depending on whether the control knowledge is good or bad. Michael: But what does it mean? How can we establish if given control knowledge is good or bad? Tuan: We have shown the control knowledge significantly improved the planning efficiency in our paper submitted to Spring Symposium on Answer set Programming. Michael: I am looking forward to listening to your talk at Stanford. I tried to show similar thing for a particular control knowledge but not yet satisfied with the results. This is one of the (many) reasons I am interested in this discussion. So let us go back to my example and let (for the sake of discussion) view my rules for 'like' and 'dislike' as control knowledge. I mentioned that I run several examples and that addition of these rules substantially improved the running time. (Actually, initially I run one example which is included in the original message. I didn't care about efficiency - just wanted to make sure the program compiles, etc, After one of my students complained that her program takes too much time on my input I became curious about the difference and run a few more examples.) Now, did I prove that my control knowledge is 'good'? Of course not. The only thing I proved is that for some inputs the extra rules help. This does not necessarily make my control knowledge good. For some inputs insertion sort works faster than quicksort which does not make it a better algorithm. Now suppose that I want to really do some experimental computer science and experimentally establish that my rules are good. I need some 'objective' suite of examples and some measure of goodness. So HOW WOULD ONE COME UP WITH THIS SUITE and THIS MEASURE? Did you do it in your paper? If so can you apply the same methodology here? Anyone has thoughts on this subject? Of course experiments are not the only possible way to evaluate efficiency of algorithms. For procedural programming this is often done with the help of complexity theory which can be used to compare different algorithms. Can we apply it here? Well, we can ask if there is a polynomial algorithm for our problem, is it NP hard, complete, etc. If a good algorithm is found can we program it in SMODELS? (We can efficiently program some procedural algorithms in Prolog.) Or here is another possible approach: We can view our declarative programs as inputs to the algorithm of, say, SMODELS. Let us say that consistent program P2 is a refinement of a program P1 if every answer set of P2 is an answer set of P1 (actually, this is too strong a condition, since we are only interested in atoms belonging to both, P1 and P2, but I am just giving a hint here). If answer sets of P1 are solutions of some problem P then so are answer sets of P2. Of course, SMODELS algorithm can take a very different time when run on P1 and P2. Say it takes T1 on P1 and T2 on P2. Now we can ask ourselves how to predict if T2 < T1? To do that we need to understand what Ti depends on. Size of the program? Not necessarily. Addition of control rules always increases the size but seem to often help efficiency. The number of answer sets? Not necessarily. There are many examples. The ratio between 'candidates' generated by 'the generate' part of the program to the number of program's answer sets? Maybe, but I doubt it since this does not take heuristics of SMODELS into account. Should we first ignore heuristic when we compare complexity or heuristic is really the 'main' part of the algorithm? There are many questions like this. ANY THOUGHT ON THE SUBJECT? Tuan: The other question with control knowledge is that, it's sound but can it bring completeness ??? To the best of my knowledge, Reiter also has a paper in AAAI00 that uses control knowledge, but both Bacchus & Reiter did not show whether the program still has completeness when adding control knowledge. Michael: What do you mean by completeness? One way to define it is to require P2 = P1 + control knowledge be a refinement of P1. This can be easily shown for my program. Or you can require P1 and P2 have the same answer sets, or have the same answer sets satisfying certain property, etc. For instance consider program P1 used to find all plans of the length N for the blocks world and P2 = P1 + 'good tower heuristics'. It is not difficult to show that for any goal G there is one to one correspondence between answer set of 'P2 + :-not G' (I am simplifying here) and plans to achieve G without destroying any good towers. It is obvious that if there is an N step plan to achieve goal G then there is an N step plan to achieve G wich does not destroy a good tower. Hence we have that P2 is a refinement of P1 complete w.r.t. plans not destroying good towers. Are you sure that something similar is not done by Bacchus? O.K., this was a long comment. Let's see if someone is listening. ============= Date: Tue, 20 Feb 2001 From: Wolfgang Faber To: Michael Gelfond On Mon, Feb 19, 2001 at 03:04:20PM -0600, Michael Gelfond wrote: > 5. like(X,Y) :- like(Y,X), person(X), person(Y). > > 6. dislike(X,Y) :- dislike(Y,X), person(X), person(Y). > > Actually, I had one more rule, > > 7. :- person(X), person(Y), > likes(X,Y), > dislikes(X,Y). > > but it somehow disappeared. > > Obviously, answer sets of program (1)--(4) contain all > the solutions of our problem. So why did I add rules (5)--(7)? > > For me these rules constitute a commonsense knowledge > about meaning of the words 'like each other' and 'dislike each other' > and hence are 'automatically' included in the program. > I guess that this automatic inclusion (I actually started > with them) reflects my 'knowledge > representation centered' style of programming. > Rule 7 is an example of defensiveness in A-Prolog programming. > It checks correctness of the 'input' database. > > The interesting question is how addition of these extra rules > will effect performance of the program? In my opinion, 7 should not affect performance very much, since it contains only input predicates or, if you have also 5 and 6, "easy to compute" predicates. "Easy to compute" could refer to predicates having a stratified definition. Systems should be able to fully determine the truthvalues of all ground instances of these predicates during grounding. So also the satisfaction of all ground instances of the above constraint can be determined during the grounding phase, so either it is known that there will be no answer set at all (if some ground instance of the constraint is not satisfied) or all answer sets satisfy all of the ground instances of the constraint. In the latter case, these ground instances need not be looked at in the subsequent computation. DLV does just that. I didn't check with smodels - however I suspect that in the presence of 5 and 6 like and dislike cannot be "domain predicates" for lparse any longer, and "like" and "dislike" are therefore no "easy predicates" for lparse, in the sense that it cannot fully determine the truth values of their instantiations. But even in this case the overhead for the subsequent smodels computation should be rather small. 5 and 6 are different, they symmetrically close like and dislike. So unless the input is already symmetric, you add knowledge. But as far as I can see, your program does not rely on the fact that like and dislike are symmetric. First I thought that 3 does, but it doesn't [if like(1,2) holds but like(2,1) does not, and at(2,5) holds and at(1,5) does not, there is always some other table t in which at(1,t) holds and at(2,t) does not]. I'm surprised to read that performance is affected quite a bit by this symmetry addition. I believe that it is due to rule 3; if we consider again the case in which we have like(1,2) but not like(2,1), and we decided at(2,5) but nothing about person 1 - not at(1,5) cannot be inferred in this case but it could if we have like(2,1). I hope that I'm not completely wrong. ============= Date: Tue, 20 Feb 2001 From: Michael Gelfond To: Wolfgang Faber Thanks for the response. You are certainly right. But here is a more detailed answer. W. In my opinion, 7 should not affect performance very much M. Agree, unless of course the input is bad. In this case improvement should be substantial. W. 5 and 6 involve "easy to compute" predicates. "Easy to compute" could refer to predicates having a stratified definition. Systems should be able to fully determine the truthvalues of all ground instances of these predicates during grounding. M. I agree. W. So also the satisfaction of all ground instances of the above constraint can be determined during the grounding phase, so either it is known that there will be no answer set at all (if some ground instance of the constraint is not satisfied) or all answer sets satisfy all of the ground instances of the constraint. In the latter case, these ground instances need not be looked at in the subsequent computation. M. Agreed W. DLV does just that. I didn't check with smodels - however I suspect that in the presence of 5 and 6 like and dislike cannot be "domain predicates" for lparse any longer, and "like" and "dislike" are therefore no "easy predicates" for lparse, in the sense that it cannot fully determine the truth values of their instantiations. M. You are right. Rules (5)-(7) are not eliminated by lparse. W. But even in this case the overhead for the subsequent smodels computation should be rather small. M. It seems to be true for my examples. W. 5 and 6 are different, they symmetrically close like and dislike. So unless the input is already symmetric, you add knowledge. M. Right. W. But as far as I can see, your program does not rely on the fact that like and dislike are symmetric. M. Well, the semantics does not. (as far as at is concerned they have the same answer sets. But operationally these rules matter. W. First I thought that 3 does, but it doesn't [if like(1,2) holds but like(2,1) does not, and at(2,5) holds and at(1,5) does not, there is always some other table t in which at(1,t) holds and at(2,t) does not]. M. You are right. Answer sets are the same. W. I'm surprised to read that performance is affected quite a bit by this symmetry addition. I believe that it is due to rule 3; if we consider again the case in which we have like(1,2) but not like(2,1), and we decided at(2,5) but nothing about person 1 - not at(1,5) cannot be inferred in this case but it could if we have like(2,1). I hope that I'm not completely wrong. M. You are absolutely right. In fact if you replace rule three by :- table(T1), table(T2), person(P1), person(P2), at(P1,T1), at(P2,T2), like(P1,P2), neq(T1,T2). then I believe rules 5, 6 will have little impact (if any). (BTW, the resulting program will be very close to 12, 15 and 19). Wolfgang, how is it for DLV? Does adding 5 and 6 improve efficiency? It will be interesting to run my program on both, SMODELS and DLV. The program size for DLV should be much smaller and it may add to the DLV performance. ============= Date: Tue, 20 Feb 2001 From: Esra Erdem and Vladimir Lifschitz To: Michael Gelfond Thanks for the explanation and the interesting questions, as always. It seems like Jonathan also considers symmetry in like/2 and in dislike/2 (Program 16). He adds some constraints symmetric to 2 and 3. This suggested to us the following modification of your program. In Program 13, we replaced your symmetry rules 5. like(X,Y) :- like(Y,X), person(X), person(Y). 6. dislike(X,Y) :- dislike(Y,X), person(X), person(Y). with 5'. like0(X,Y) :- like(X,Y), person(X), person(Y). like0(X,Y) :- like(Y,X), person(X), person(Y). 6'. dislike0(X,Y) :- dislike(X,Y), person(X), person(Y). dislike0(X,Y) :- dislike(Y,X), person(X), person(Y). replaced `like' in 3 with `like0', and replaced `dislike' in 2 with `dislike0'. It seems like this modification reduces the computation time of smodels on your example. Here are the statistics. With the original program, we got Duration: 22.490 Number of choice points: 92 Number of wrong choices: 0 Number of atoms: 21021 Number of rules: 229062 Number of picked atoms: 66992 Number of forced atoms: 0 Number of truth assignments: 323864 Size of searchspace (removed): 1000 (0) and, with the modified program, we got Duration: 5.230 Number of choice points: 92 Number of wrong choices: 0 Number of atoms: 1021 Number of rules: 9290 Number of picked atoms: 66992 Number of forced atoms: 0 Number of truth assignments: 323840 Size of searchspace (removed): 1000 (0) This modification reduces the size of the program in that like/2 and dislike/2 become domain predicates in the modified program. ============= Date: Wed, 21 Feb 2001 From: Esra Erdem and Vladimir Lifschitz To: Michael Gelfond In our previous message we gave smodels' statistics on your program with a modification of the symmetry rules. We also tried Jonathan's program (Program 16) with your problem. Here are the statistics: Duration: 0.920 Number of choice points: 92 Number of wrong choices: 0 Number of atoms: 1211 Number of rules: 1720 Number of picked atoms: 78591 Number of forced atoms: 1 Number of truth assignments: 401226 Size of searchspace (removed): 1000 (0) ============= Date: Wed, 21 Feb 2001 From: Veena Mellarkod and Monica Nogueira To: Esra Erdem and Vladimir Lifschitz We looked at program (16) by Jonathan: % Rule (1) each guest must be at exactly one table 1 {at(G,T) : table(T)} 1 :- guest(G). % Rule (2a - 2b) guests related by a() must not be at different tables :- at(G1,T1), at(G2,T2), a(G1,G2), guest(G1), guest(G2), table(T1), table(T2), T1 != T2. :- at(G1,T1), at(G2,T2), a(G2,G1), guest(G1), guest(G2), table(T1), table(T2), T1 != T2. % Rule (3a - 3b) guests related by b() must not be at the same table :- at(G1,T), at(G2,T), b(G1,G2), guest(G1), guest(G2), table(T). :- at(G1,T), at(G2,T), b(G2,G1), guest(G1), guest(G2), table(T). % Rule (4) no table can have more than m guests :- m+1 {at(G,T) : guest(G)}, table(T). Let's call this program P1. We noticed that program P1 and program P2, obtained from P1 by removing rules (2b) and (3b), are equivalent. In fact, rules 2a (3a) and 2b (3b) are identical up to reordering of conjuncts and renaming of variables. So symmetry plays no role in Jonathan's program. (By the way, removing these rules even made the program slightly faster.) We were also curious if the efficiency of the program could be improved using the idea of symmetry (Michael's rules). Let P3 be obtained from P2 by adding symmetry rules and using negation as failure as follows: % each guest must be at exactly one table 1 {at(G,T) : table(T)} 1 :- guest(G). a0(X,Y) :- a(Y,X), guest(X), guest(Y). a0(X,Y) :- a(X,Y), guest(X), guest(Y). b0(X,Y) :- b(Y,X), guest(X), guest(Y). b0(X,Y) :- b(X,Y), guest(X), guest(Y). % guests related by a() must not be at different tables :- at(G1,T1), not at(G2,T1), a0(G1,G2), guest(G1), guest(G2), table(T1). % guests related by b() must not be at the same table :- at(G1,T), at(G2,T), b0(G1,G2), guest(G1), guest(G2), table(T). % no table can have more than m guests :- m+1 {at(G,T) : guest(G)}, table(T). This is the timing for P3 using Michael's program input, against 0.68 secs for P1, 0.62 secs for P2. Duration: 0.590 Number of choice points: 92 Number of wrong choices: 0 Number of atoms: 1381 Number of rules: 1090 Number of picked atoms: 72024 Number of forced atoms: 0 Number of truth assignments: 357380 Size of searchspace (removed): 1000 (0) Difference in timings increase for larger instances of the problem. ============= Date: Thu, 22 Feb 2001 From: Vladimir Lifschitz To: Veena Mellarkod and Monica Nogueira Yes, your observations do show that symmetry plays no role in Program 16. Since the differences between the computation times of P1, P2 and P3 (0.68, 0.62, 0.59) are small, and Program P2 is simpler than the others, I would say that P2 looks particularly attractive. ============= Date: Thu, 22 Feb 2001 From: Marcello Balduccini, Monica Nogueira and Veena Mellarkod To: Vladimir Lifschitz We changed Michael's original program in two ways: a) eliminated recursion from the symmetry rules as indicated by you and Esra, b) substituted constraint (4) by a "choice constraint" similar to the constraint used in Jonathan's (16) and Chitta's (11) programs. Let's call this new program P4 (see below). It is interesting that the use of "choice constraint" rules and the symmetry rules plus negation as failure in rule (3) improved performance. The new timing is Duration: 0.530 Number of choice points: 92 Number of wrong choices: 0 Number of atoms: 1121 Number of rules: 490 Number of picked atoms: 66992 Number of forced atoms: 0 Number of truth assignments: 323940 Size of searchspace (removed): 1000 (0) against program P2 Duration: 0.630 Number of choice points: 92 Number of wrong choices: 0 Number of atoms: 1211 Number of rules: 1120 Number of picked atoms: 78591 Number of forced atoms: 1 Number of truth assignments: 401226 Size of searchspace (removed): 1000 (0) %******************************************** %*** PROGRAM 13(B), for smodels %*** By Michael Gelfond (mgelfond@cs.ttu.edu) %******************************************** %%%%%%%%% INPUT const n=10. % number of tables const m=10. % number of chairs const k=n*m. % number of people table(1..n). person(1..k). % The input database. We assume the data to be correct. like(1,2). like(3,4). like(5,6). like(10,13). like(10,11). like(4,5). dislike(1,5). dislike(1,6). dislike(12,16). dislike(13,14). dislike(7,8). dislike(7,9). % Symmetry Rules like0(X,Y) :- like(X,Y), person(X), person(Y). like0(X,Y) :- like(Y,X), person(X), person(Y). dislike0(X,Y) :- dislike(X,Y), person(X), person(Y). dislike0(X,Y) :- dislike(Y,X), person(X), person(Y). %%%%%%%%%%% SOLUTION % Rule (1) - SELECT a candidate solution - m people to seat at each table m{at(P,T) : person(P)}m :- table(T). % Rule (2) - People who dislike each other should seat at different tables :- table(T), person(P1), person(P2), at(P1,T), at(P2,T), dislike0(P1,P2). % Rule (3) - People who like each other should seat at the same table :- table(T), person(P1), person(P2), at(P1,T), not at(P2,T), like0(P1,P2). % Rule (4) - A person is seated at at most one table. :- 2 {at(P,T):table(T)}, person(P). hide. show at(P,T). ============= Date: Thu, 22 Feb 2001 From: Vladimir Lifschitz To: Marcello Balduccini, Monica Nogueira and Veena Mellarkod Your example seems to show that merely replacing :- table(T1), table(T2), neq(T1,T2), person(P), at(P,T1), at(P,T2). with :- 2 {at(P,T):table(T)}, person(P). has a significant effect on the computation time. It's remarkable how efficiently cardinality constraints are implemented in smodels. ============= Date: Fri, 23 Feb 2001 From: Jonathan Campbell To: Marcello Balduccini, Monica Nogueira and Veena Mellarkod I've been experimenting a little with programs P3 and P4, and have discovered a curious difference in their performance. The two programs are essentially the same except for the generate step and one of the constraints. P3 has the rules: 1 {at(G,T) : table(T)} 1 :- guest(G). :- m+1 {at(G,T) : guest(G)}, table(T). while P4 has: m{at(P,T) : person(P)}m :- table(T). :- 2 {at(P,T):table(T)}, person(P). So P3 controls the number of tables per guest in the generation step and the number of guests per table with a constraint, while P4 does the opposite. I compared these two methods using the two programs included below -- they share all the same code except for two pairs of lines like the ones above. I tested the two programs on a set of randomly generated problems with 25 guests, and got the following average times: "P3" "P4" Solvable: .027 .024 Unsolvable: .023 56.026 So "P3" is slightly slower when there is an answer set, but almost 2500 times faster when there are no answer sets! The difference for unsolvable problems seems to increase rapidly as the problem size increases. I don't have a good explanation for this, just a vague feeling that "P3" does less generation and more constraint than "P4". Do you have any suggestions? SHARED CODE: ------------ % input const chairs = 5. const tables = 5. const guests = chairs * tables. likes(2,7). dislikes(4,23). (etc.) % domain predicates table(1..tables). guest(1..guests). % symmetry rules likes0(G1,G2) :- likes(G1,G2), guest(G1), guest(G2). likes0(G1,G2) :- likes(G2,G1), guest(G1), guest(G2). dislikes0(G1,G2) :- dislikes(G1,G2), guest(G1), guest(G2). dislikes0(G1,G2) :- dislikes(G2,G1), guest(G1), guest(G2). % guests who like each other must be at the same table :- at(G1,T), not at(G2,T), likes0(G1,G2), guest(G1), guest(G2), table(T). % guests who dislike each other must not be at the same table :- at(G1,T), at(G2,T), dislikes0(G1,G2), guest(G1), guest(G2), table(T). % display hide. show at(G,T). CODE FOR "P3" ONLY: ------------------- % each guest must be at exactly one table 1 {at(G,T) : table(T)} 1 :- guest(G). % no table can have more guests than there are chairs :- chairs+1 {at(G,T) : guest(G)}, table(T). CODE FOR "P4" ONLY: ------------------- % each table must have exactly as many guests as chairs chairs {at(G,T) : guest(G)} chairs :- table(T). % no guest can be at more than one table :- 2 {at(G,T) : table(T)}, guest(G). ============= Date: Fri, 23 Feb 2001 From: Marcello Balduccini, Monica Nogueira and Veena Mellarkod To: Jonathan Campbell Thank you for sharing your results with us. We are trying to replicate your experiments in order to analyse them. To this extent, we would appreciate if you could describe the algorithm that you used for generating random instances, or if you could publish the instances in case they were manually generated. We are also interested in knowing how many instances you used, and in having an idea of the variance of the experimental times. ============= Date: Sun, 25 Feb 2001 From: Marcello Balduccini, Monica Nogueira and Veena Mellarkod To: Jonathan Campbell Initially, we also believed that the difference between the computation times for P3 and P4 was due to the high number of permutations generated by P4 against much less by P3. We computed the number of permutations for both programs and plotted it. However, further analysis showed that it is not the case. In fact we ran some experiments using the following unsolvable input likes(1,2) dislikes(1,2) for different numbers of chairs and tables. In every case there is a striking difference between the number of choice points for the programs. For example, in an experiment with chairs=5 and tables=4 we got zero choice points for P3 and 34355009 for P4. Since the graph suggests that the number of permutations for P4 is the square of the number of permutations for P3, we believe that the number of choice points gives a much better account for the exponential growth of computational times. After analysing better the programs, we realized that there is an extra constraint implicit in P4 that is stated explicitly in P3: "every guest must be seated" (it is part of the choice rule: "each guest must be at exactly one table"). It turns out that if we make this knowledge explicit also in P4 we get the same performance as P3 (and the same number of choice points). The cardinality constraint that was added to P4 is: :- {at(G,T):table(T)} 0, guest(G). This constraint helps to detect the conflict very early in the Smodels computation: it happens even before any literal is selected by the heuristic function (that's why the number of choice points is zero). ============= Date: Sun, 25 Feb 2001 From: Jonathan Campbell To: Marcello Balduccini, Monica Nogueira and Veena Mellarkod Adding that extra constraint to P4 does indeed seem to eliminate the huge disparity between the two programs on unsolvable problems. It looks like the modified P4 (P5?) runs slightly slower than P3 on both solvable and unsolvable programs -- around 20% slower on instances with 10 tables and 10 chairs per table -- but that seems to be because it's generating more rules than P3, since the number of choice points is about the same. That's just anecdotal evidence from a few test runs (about 40). I'll experiment with them a little more and let you know if anything interesting turns up. To generate random test instances, I just wrote a little program that takes four arguments: the number of tables, the number of chairs per table, and two probabilities p and q. For each pair of guests G1 and G2, it generates the rule likes(G1,G2) with probability p; dislikes(G1,G2) with probability q; and neither with probability 1-p-q. It won't specifically generate solvable or unsolvable instances; I just ran it enough times to get sufficient data on both. ============= Date: Mon, 26 Feb 2001 From: Vladimir Lifschitz To: Marcello Balduccini, Monica Nogueira and Veena Mellarkod Your view that the computation time of smodels doesn't depend much on the number of potential solutions generated by the program agrees with what we noted earlier in connection with other uses of smodels. By the way, you observe that, according to your graph, the number of potential solutions generated by the rule m {at(G,T) : guest(G)} m :- table(T). (1) is approximately the square of the number of potential solutions generated by 1 {at(G,T) : table(T)} 1 :- guest(G). (2) I made this observation too, using a little bit of high school algebra. Rule (1) generates a function from tables to m-element subsets of the set of guests. The number of such functions is choose(m^2,m)^m. Since the binomial coefficient choose(m^2,m) approximately equals (m^2)^m, or m^(2m), the number of functions approximately equals (m^(2m))^m, or m^(2m^2). Rule (2), on the other hand, generates functions from guests to tables; the number of such functions is (m^m)^m, or m^(m^2). That's how we did mathematics in the good old times, when computers were not easily available. On another subject: the additional constraint that makes P4 so much faster % every guest must be seated :- {at(G,T):table(T)} 0, guest(G). is familiar to us from Veena's solution (Program 15), where it is expressed differently: seated(G) :- at(G,T), guest(G), table(T). :- guest(G), not seated(G). It would be interesting to check whether there is a significant difference in efficiency between these two versions. ============= Date: Mon, 26 Feb 2001 From: Paolo Ferraris To: Jonathan Campbell Your experiment and the results seem interesting. I have tried other programs similar to programs P3 and P4. One of those is obtained from P3 or P4 by substituting the rules that differentiate P3 from P4 with { at(G,T) : guest(G) : table(T)}. :- 2 {at(G,T) : table(T)}, guest(G). :- {at(G,T) : table(T)} 0, guest(G). :- m + 1 {at(G,T) : guest(G)}, table(T). The first of these four rules is our selection rule (a big rule when grounded), and all the boundaries are expressed by constraints. I had not enough time to make exhaustive experiments (only a few randomly generated test cases) so I cannot express any conclusion, but it seems that the execution times are very similar to those of program P3 in both solvable and unsolvable problems.