Expressing "More" in Answer Set Programming Date: Thu, Sep 1 2005 From: Yuliya Lierler To: TAG I am reading a paper on natural language understanding in which the author uses model-generation tools for first-order logic. One of the difficulties he has to deal with is that first-order logic doesn't have symbols corresponding to the words "more" and "most." The question that occurred to me is whether we face the same difficulty in ASP. Consider the following problem: In a group of n men and m women, some are Republicans and the others are Democrats. We know that (1) there are more Republicans among men than among women, and (2) among men, there are more Republicans than Democrats. The goal is to write a logic program whose answer sets represent the possible sets of Republicans satisfying these conditions. I was not able to do this in a natural way, without using "tricks." Can you help me? ==================== Date: Thu, Sep 1 2005 From: Gerald Pfeifer To: Yuliya Lierler Have you looked at DLV's aggregates? ==================== Date: Thu, Sep 1 2005 From: Wolfgang Faber To: Yuliya Lierler I think "more" should be feasible in some way using aggregates (cardinality constraints). "Most" seems to be difficult, as its meaning seems to be less precise, sometimes we might mean "more than 50%", sometimes "100-epsilon %", sometimes "more than 2/3" etc. > I was not able to do this in a natural way, without using "tricks." Can > you help me? Are cardinality constraints ok or a trick? The difficulty is probably that you cannot compare two aggregations directly in lparse/DLV syntax, i.e. you cannot write (in DLV syntax) #count{X: republican(X), male(X)} > #count{X: republican(X), female(X)} but this could also be written as #count{X: republican(X), male(X)} = N, #count{X: republican(X), female(X)} < N or #count{X: republican(X), male(X)} > N, #count{X: republican(X), female(X)} = N The problem here is that N is somehow unsafe (in a weak sense). Since republican will be a predicate that is not fully computed during grounding, the above will not work in current DLV; we know how to implement that but did not find time to do it so far. One could also write #count{X: republican(X), male(X)} > N, #count{X: republican(X), female(X)} < N but this would be really unsafe (the value for N is not bound in any way). The workaround would be to define a predicate which is true for all possible cardinalities of N, but this is not very elegant. I guess similar arguments hold also for lparse+backend systems. I found one other issue in the example: In a German equivalent, people might have intended to express by (1) that the percentage of Republicans among men is higher than the one among women. I don't know whether the same happens for English. Strictly speaking, if this was intended, the sentence is not correct, but I think such mistakes occur quite frequently (in newspapers etc). =================== Date: Thu, Sep 1 2005 From: John McCarthy To: TAG I don't think that logic should be extended to include more and less. Rather set theory formalized within logic should be used to handle these concepts. ==================== Date: Thu, Sep 1 2005 From: Monica Nogueira To: Yuliya Lierler Is it possible for a male (female) to be neither Republican nor Democrat? ==================== Date: Thu, Sep 2 2005 From: Yuliya Lierler To: Monica Nogueira It is possible that a male (female) is neither Republican nor Democrat. ==================== Date: Thu, Sep 2 2005 From: Yuliya Lierler To: Wolfgang Faber The ambiguity of most is definitely an issue but we could for a start agree on its understanding as ''more than 50 %''. In natural language similar issues come even with what would seem more trivial quantifier/determiner as one, two... In some context they may stand for exactly one, exactly two ... and in other they would stand for at least one, at least two ... . Consider sentences as (1) Two men are needed to lift the piano. (2) Two men are lifting the piano. In case of (1) if we try to encode this statement formally we probably pick the meaning of ''two'' as at least two men. In case of (2) we probably pick exactly two as a meaning of ''two''. But let us go back to an interesting topic of ''more''. >Are cardinality constraints ok or a trick? Use of cardinality constraints is not a trick, it is a use of especially designed machinery for encoding properties over sets. In case of comparing the numbers of men and women who are republicans, we are comparing the cardinalities of two sets, where one stands for the set of republican men and another for the set of republican women. Hence using cardinality constraints for this task looks as a proper choice. Unfortunately if I understand correctly use of cardinality constraints for the case of expressing ''more'' is not trivial. As you mentioned syntax of lparse/DLV does not allow something as #count{X: republican(X), male(X)} > #count{X: republican(X), female(X)}. The problem seem to be in the fact that the boundary on the sets is not known in advance and we may not simply use a constant to specify boundaries on the sets. Therefore we face the difficulty of using cardinality constraints in a straightforward manner. From your email it looked to me as there is a good news and it is already known how to implement cardinality constraints with variable boundaries (where would it be possible to find the details on this?), and the less good news is that it is not yet available. >The difficulty is probably that you cannot compare two aggregations >directly in lparse/DLV syntax, i.e. you cannot write (in DLV syntax) > >#count{X: republican(X), male(X)} > #count{X: republican(X), female(X)} > >but this could also be written as > >#count{X: republican(X), male(X)} = N, #count{X: republican(X), >female(X)} < N >or >#count{X: republican(X), male(X)} > N, #count{X: republican(X), >female(X)} = N > >The problem here is that N is somehow unsafe (in a weak sense). Since >republican will be a predicate that is not fully computed during >grounding, the above will not work in current DLV; we know how to >implement that but did not find time to do it so far. > >One could also write >#count{X: republican(X), male(X)} > N, #count{X: republican(X), >female(X)} < N >but this would be really unsafe (the value for N is not bound in any >way). > Is syntax you mentioned last implemented in DLV? I tried the following program: male(john). republican(john). male(matt). republican(matt). female(joana). republican(joana). female(luise). democrat(luise). moreMaleRepublicans:-#count{X: republican(X), male(X)} > N, #count{X: republican(X), female(X)} < N. Below is DLV output: __________________________ tag/more> dlv moreAgregate.dlv DLV [build BEN/Feb 23 2005 gcc 2.95.4 20011002 (Debian prerelease)] moreAgregate.dlv: line 5: Rule is not safe. Aborting due to parser errors. __________________________ >The workaround would be to define a predicate which is true for all >possible cardinalities of N, but this is not very elegant. > How would you do this? To provide a meaning for ''tricky encoding'' expression, below is such an encoding of (1) there are more Republicans among men than among women _________________________ male(john). republican(john). male(matt). republican(matt). female(joana). republican(joana). female(luise). democrat(luise). %auxilary id(0..3). %femRep predicates specify females that are republicans femRep(W):- female(W), republican(W). %each female republican receives an id 1{idFemaleRep(W, X):id(X)}1:-femRep(W). %IDs to female republicans shell be assigned starting from 0 idFem(X):-idFemaleRep(W, X), id(X), femRep(W). :-not idFem(0), femRep(W). %female republicans shell have different IDs :-idFemaleRep(W, X), idFemaleRep(W1, X), W!=W1, femRep(W;W1), id(X). %IDs to female republicans shell be assigned consequently, i.e. % first lady receives id 0, next one id 1, next one id 2 and so on :-idFem(X+1), not idFem(X), id(X). %Number of female republicans is 1 + the highest assigned id to %republican woman numFemaleRepublicans(X+1):-idFem(X), not idFem(X+1), id(X). %The same information is encoded for male %mRep predicates specify male that are republicans mRep(M):- male(M), republican(M). %each male republican receives an id 1{idMaleRep(M, X):id(X)}1:-mRep(M). %IDs to male republicans shell be assigned starting from 0 idMale(X):-idMaleRep(W, X), id(X), mRep(W). :-not idMale(0), mRep(W). %male republicans shell have different IDs :-idMaleRep(M, X), idMaleRep(M1, X), M!=M1, mRep(M;M1), id(X). %IDs to male republicans shell be assigned consequently, i.e. % first lady receives id 0, next one id 1, next one id 2 and so on :-idMale(X+1), not idMale(X), id(X). %Number of male republicans is 1 + the highest assigned id to %republican man numMaleRepublicans(X+1):-idMale(X), not idMale(X+1), id(X). moreMaleRepublicans:-numMaleRepublicans(X), numFemaleRepublicans(Y),id(X;Y),X>Y. hide. show numFemaleRepublicans(X). show numMaleRepublicans(X). show moreMaleRepublicans. ==================== Date: Sat, Sep 17 2005 From: Gerald Pfeifer To: TAG > From your email it looked to me > as there is a good news and it is already known how to implement > cardinality constraints with variable boundaries (where would it be > possible to find the details on this?), and the less good news is that it > is not yet available. Once this is available in DLV -- do you have an idea when that might be, Wolfgang? -- I volunteer to also implement support for the form above (#count{...} > #count{...}). =================== Date: Fri, Sep 30 2005 From: Veena Mellarkod To: TAG Here is my quick attempt on the program using ASP language. %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% const n = 2. const m = 3. range(card,0..5). men(1..n). women(1..m). {X : repub_men(X) } subset {X : men(X)}. {X : repub_women(X)} subset {X : women(X)}. demo_men(X) :- not repub_men(X), men(X). demo_women(X) :- not repub_women(X), women(X). rmen(M):- is(card, {X : repub_men(X)},M), range(card,M). rwomen(W):- is(card, {X : repub_women(X)},W), range(card,W). :- rmen(M), rwomen(W), range(card,M), range(card,W), le(M,W). dmen(M) :- is(card, {X : demo_men(X)},M), range(card,M). :- dmen(D), rmen(R), range(card,D), range(card,R), le(R,D). compute 0,{}. show repub_men. show repub_women. %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% The answer sets for this program are: Answer Set #1 : repub_men(1), repub_men(2) Answer Set #2 : repub_women(1), repub_men(1), repub_men(2) Answer Set #3 : repub_women(2), repub_men(1), repub_men(2) Answer Set #4 : repub_women(3), repub_men(1), repub_men(2) %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% We need to specify the range of the cardinality function, range(card,0..5). Monica, in this program men (women) are either republicans or democrats. So, the others in the model are all democrats. =================== Date: Sat, Oct 1 2005 From: Michael Gelfond To: TAG As far as I understand there are several types of questions. 1. The first deals with syntactic sugar (ASET has a simple experimental implementation and does not have much to say about this. Not difficult to add though). 2. The second seems to deal with the requirement for an upper and lower bound on the range of aggregates. ASET uses grounding similar to that of LPARSE and so does not add anything new to the discussion. One approach to avoiding this is outlined in our new paper S. Baselice, P. A. Bonatti, and M. Gelfond. Towards an Integration of Answer Set and Constraint Solving. which will appear in proceedings of ICLP05. (Meanwhile it can be found on the KRlab webpage at TTU.) The idea is not to ground certain variables (e.g. values of the aggregates in the counting example from the discussion) and use constraints solvers instead. This will eventually be expended and implemented but it is not something I expect to be done in a month. 3. There is a comment by John McCarthy who says: ''I don't think that logic should be extended to include more and less. Rather set theory formalized within logic should be used to handle these concepts.'' I think that the logic of logic programs with aggregates satisfies this condition but am not absolutely sure that I understand John correctly. 4. Finally, I have a question for Yulia: You mention two sentences: (1) Two men are needed to lift the piano. (2) Two men are lifting the piano. Intuitively, they can be translated into two rather different formal statements, e.g. (in very simplified form) (1) ==> :- occurs(lift_piano,T), |{X : actor(X,lift_piano)}| < 2. (2) ==> is_occurring(lift_piano,T). :- |{X : actor(X,lift_piano)}| <> 2. Can your linguistic tools produce such a translation? ==================== Date: Fri, Oct 21 2005 From: Yuliya Lierler To: Veena Mellarkod Could you please help me to understand the details on the semantic of a predicate as is(card, {X : repub_men(X)},M) that appears in the rule rmen(M):- is(card, {X : repub_men(X)},M), range(card,M). ==================== Date: Fri, Oct 21 2005 From: Veena Mellarkod To: Yuliya Lierler The language I used is ASet-Prolog, developed by Dr. Gelfond and Mary Heidt. The rule: is(card,{X : repub_men(X)},M) can be read as: 'The cardinality of set X such that X is a republican men is M'. ==================== Date: Fri, Oct 21 2005 From: Yuliya Lierler To: Michael Gelfond > You mention two sentences: > (1) Two men are needed to lift the piano. > (2) Two men are lifting the piano. > > Intuitively, they can be translated into two > rather different formal statements, e.g. > (in very simplified form) > > (1) ==> > :- occurs(lift_piano,T), |{X : actor(X,lift_piano)}| < 2. > > (2) ==> > is_occurring(lift_piano,T). > :- |{X : actor(X,lift_piano)}| <> 2. > > Can your linguistic tools produce such a translation? > We can state the main difference in giving the translation for sentences (1) and (2) as follows. In case of sentence (1) quantifier TWO takes a meaning of "at least two", while in case of sentence (2) quantifier's meaning is "exactly two". Most of the work in natural-language semantics is devoted to the issues of developing formal representation language specifying NL utterance semantics suitable for language oriented inference. Let us call such languages FS. Less work is done in developing inference engines themselves in the area. Lately there is work done in using first order logic tools as FOL model generators or FOL theorem proving for performing the inference on semantic representation of a sentence. If I take "linguistic tools" in your question to stand for the tools that output the semantic representation of a sentence in languages FS, then to the best of my knowledge such representation would be still ambiguous as using one quantifier "TWO" for both of its meanings "Exactly Two" and "At Least Two". Or rather FS language authors say that ''TWO'' is in principle ambiguous, but we leave this issue to be decided somewhere else and for us will consider only one explicit meaning of ''TWO'', i.e. ''Exactly Two''. Languages FS leave such ambiguities to be resolved elsewhere with the use of pragmatics, word-sense (lexical semantics), world knowledge, common sense reasoning. There is a work by K. Konrad who developed an inference engine for FS language that takes into account generalized quantifiers such as "TWO", "Most"... [Model Generation for Natural Language Interpretation and Analysis, PhD thesis by Karsten Konrad, 2000]. In Konrad's thesis he states the definition of quantifier "TWO" to take a meaning of "Exactly Two", on the other hand in the logic he proposes it is possible to state the other meaning of quantifier as well. But I am not familiar with the work that would disambiguate quantifiers as "TWO" on the level of FS languages. And most probably there is a good reason for that. In order to disambiguate its meaning there is a certain type of inference or word's sense analysis or deeper syntactic analyses has to be preformed before one can state which meaning quantifier possesses. Here I will allow myself to speculate on some linguistical reasoning, there for sure are some linguists that would disagree with this line of reasoning and see issues differently:-) In case of sentence (2) the fact that present contineous tense is used would probably be sufficient to conclude that TWO stands for "exactly two", while in sentence (1) it is important to know the semantics of "needed" to conclude that TWO stands for "at least two". In this sentence ''needed'' stands for ''necessary'', that on the one hand implies that at least two people are necessary to lift the piano, on the other hand does not pose a constraint that exactly two people shall be performing this action. ==================== Date: Fri, Oct 21 2005 From: Yuliya Lierler To: TAG Below is part of the message that I sent to TAG on Sep. the 2d with respect to DLV encoding of ``more than''. >>Wolfgang Faber: >>The difficulty is probably that you cannot compare two aggregations >>directly in lparse/DLV syntax, i.e. you cannot write (in DLV syntax) >> >>#count{X: republican(X), male(X)} > #count{X: republican(X), female(X)} >> >>but this could also be written as >> >>#count{X: republican(X), male(X)} = N, #count{X: republican(X), >>female(X)} < N >>or >>#count{X: republican(X), male(X)} > N, #count{X: republican(X), >>female(X)} = N >> >>The problem here is that N is somehow unsafe (in a weak sense). Since >>republican will be a predicate that is not fully computed during >>grounding, the above will not work in current DLV; we know how to >>implement that but did not find time to do it so far. >> >>One could also write >>#count{X: republican(X), male(X)} > N, #count{X: republican(X), >>female(X)} < N >>but this would be really unsafe (the value for N is not bound in any >>way). >> > Yuliya: >Is syntax you mentioned last implemented in DLV? I tried the following >program: Program(Rep): >male(john). republican(john). >male(matt). republican(matt). >female(joana). republican(joana). >female(luise). democrat(luise). >moreMaleRepublicans:-#count{X: republican(X), male(X)} > N, #count{X: >republican(X), female(X)} < N. > >Below is DLV output: >__________________________ >tag/more> dlv moreAgregate.dlv >DLV [build BEN/Feb 23 2005 gcc 2.95.4 20011002 (Debian prerelease)] > > >moreAgregate.dlv: line 5: Rule is not safe. >Aborting due to parser errors. >From personal communication with Wolfgang Faber, I discovered one interesting fact for this discussion. Current implementation of DLV is able to take variable boundary in the cardinality constraints whenever during the time of grounding DLV can itself compute the constant that has to replace this variable. For instance, in case of cardinality constraint :-#count{X: republican(X), male(X)} = N, #count{X:republican(X), female(X)} < N. whenever republican and male predicates are fully computed during the grounding, DLV may then assign the constant value to N at the time of grounding and proceed with its computation. If we consider above program Program(Rep) predicates male and republican are fully computed during the grounding. Therefore it is possible to use current implementation of DLV to find program's answer sets if the rule in line 5: moreMaleRepublicans:-#count{X: republican(X), male(X)} > N, #count{X:republican(X), female(X)} < N. is replaced by the rule: moreMaleRepublicans:-#count{X: republican(X), male(X)} = N, #count{X:republican(X), female(X)} < N. ==================== Date: Fri, Oct 21 2005 From: Marcello Balduccini To: Yuliya Lierler > Languages FS leave such ambiguities to be resolved > elsewhere with the use of pragmatics, word-sense (lexical semantics), > world knowledge, common sense reasoning. Based on what I learned by building our translator from natural language to A-Prolog, I would be surprised if at some point it became possible to do this type of disambiguation in FS without the use of world knowledge and deep reasoning. Sometimes disambiguation requires not only common-sense knowledge about the words in the particular sentence, but also a decent understanding (based on reasoning) of the story described by the preceding sentences. > in > sentence (1) it is important to know the semantics of "needed" to > conclude that TWO stands for "at least two". In this sentence ''needed'' > stands for ''necessary'', that on the one hand implies that at least two > people are necessary to lift the piano, on the other hand does not pose > a constraint that exactly two people shall be performing this action. It is worth stressing the importance of common-sense knowledge here: the interpretation "at least" instead of "exactly" doesn't depend just on the use of the word "need", but also on the context. Consider the sentence "2 keys are needed to open this door". The default interpretation in this case is that exactly 2 keys are needed. ==================== Date: Fri, Oct 21 2005 From: Alessandro Provetti To: TAG It interesting to follow this discussion generated by Yuliya's example. I only wanted to say that the activity of making the meaning of the quantifiers precise sounds a lot like Word sense disambiguation (WSD) to me. WSD is a field in its own right by now, eg.: http://www.aclweb.org/acl2005/index.php?tutorial-mp and there is this a -potentially- very interesting activity: using volunteers to teach computers the meaning of words in context: http://teach-computers.org/ Interestingly, one of the most active people, and a real authority in this this field, Rada Mihalcea is located in North Texas: http://www.cs.unt.edu/~rada/ The two of us are in the PC of this AAAI Spring Symposium: http://www.umbriacom.com/aaai2006_weblog_symposium/ which I invite you to consider. perhaps it could also become a venue for the type of formalization activity Yuliya is proposing. ==================== Date: Fri, Nov 4 2005 From: Yuliya Lierler To: Veena Mellarkod and TAG Let us compare the encoding of the (1) there are more Republicans among men than among women requirement of ''more than...'' problem in the solution that you posted in ASetProlog and the encoding that system DLV is able to process. In ASet-Prolog the code that encodes requieremnet (1) follows (ASetPr): range(card,0..5). (0) rmen(M):- is(card, {X : repub_men(X)},M), range(card,M). (1) rwomen(W):- is(card, {X : repub_women(X)},W), range(card,W). (2) :- rmen(M), rwomen(W), range(card,M), range(card,W), le(M,W). (3) The code that would encode similar condition in DLV is presented below (DlvPr): moreMaleRepublicans:-#count{X: republican(X), male(X)} = N, (4) #count{X:republican(X), female(X)} < N. :-not moreMaleRepublicans. (5) As mentioned in our previous discussion at the moment DLV can process rule (4) only in case if the cardinality of sets {X: republican(X), male(X)}, and {X:republican(X), female(X)} can be determined at the time of grounding, in other words N can be assigned a constant at the time of grounding. For instance, such condition is satisfied for the program Rep consisting of following rules: male(john). republican(john). male(matt). republican(matt). female(joana). republican(joana). female(luise). democrat(luise). But if we take Veena's ASet-Prolog program: men(1..n). women(1..m). {X : repub_men(X) } subset {X : men(X)}. {X : repub_women(X)} subset {X : women(X)}. demo_men(X) :- not repub_men(X), men(X). demo_women(X) :- not repub_women(X), women(X). that states (i) any man may be republican, (ii) any woman may be republican,(iii) every man who is not a republican is a democrat, (iiii) every woman who is not a republican is a democrat, cardinality N of #count{X: repub_men(X), men(X)} = N expression could not be calculated at the time of grounding. For the case of ASet-Prolog: rules as (1,2) are grounded with the use of cardinality range given by rule (0), hence cardinality ''N'' stays a variable. Here I have a question to DLV group: - is it possible similarly to ASet-Prolog to allow range to N in rule (4) within DLV encoding? And I have another question for Aset-Prolog group: - is it possible within ASet-Prolog implementation to optimize grounding of cardinality condition when at the time of grounding the cardinality of set may be assigned to some constant as in case of Rep program? ==================== Date: Fri, Nov 4 2005 From: Veena Mellarkod To: Yuliya Lierler Yes, if the cardinality of the set is known then we can optimize the grounding. I believe it should be possible in ASET Implementation, it is not yet done. ==================== Date: Sat, Nov 5 2005 From: Nicola Leone To: Yuliya Lierler > (1) there are more Republicans among men than among women > > The code that would encode similar condition in DLV is presented below > (DlvPr): > > moreMaleRepublicans:-#count{X: republican(X), male(X)} = N, (4) > #count{X:republican(X), female(X)} < N. > :-not moreMaleRepublicans. (5) > > As mentioned in our previous discussion at the moment DLV can process > rule (4) only in case if the cardinality of sets {X: republican(X), > male(X)}, and {X:republican(X), female(X)} can be determined at the time > of grounding, in other words N can be assigned a constant at the time of > grounding. Note that it's sufficient that ONE of the cardinality of the two sets is determined at the time of grounding to instantiate N an let DLV wor fine. > > But if we take Veena's ASet-Prolog program: > men(1..n). > women(1..m). > > {X : repub_men(X) } subset {X : men(X)}. > {X : repub_women(X)} subset {X : women(X)}. > > demo_men(X) :- not repub_men(X), men(X). > demo_women(X) :- not repub_women(X), women(X). > > that states (i) any man may be republican, (ii) any woman may be > republican,(iii) every man who is not a republican is a democrat, (iiii) > every woman who is not a republican is a democrat, cardinality N of > #count{X: repub_men(X), men(X)} = N expression could not be calculated at > the time of grounding. > > For the case of ASet-Prolog: rules as (1,2) are grounded with the use of > cardinality range given by rule (0), hence cardinality ''N'' stays a > variable. > > Here I have a question to DLV group: > - is it possible similarly to ASet-Prolog to allow range to N in rule (4) > within DLV encoding? The answer is Yes, from the viewpoint of language expressiveness there is no problem at all; but we discourage this practice, since it might lead to an exponential instantiation-size. Define the legal N-values by a predicate, and then use this predicate in the rule body to make N safe: #maxint=3. cardMaleSet(0..3). moreMaleRepublicans:- cardMaleSet(N), #count{X: republican(X), male(X)} = N, #count{X: republican(X), female(X)} < N. should work even if the first set is not "solved" during grounding. The problem of this encoding is that, for each value of N in [0..3], you generate one ground rule for each set of male persons of size N. If the party of the male persons is completely unknown (we only know that every male is either republican or democrat), then the number of ground rules will be 2^N, where N is the number of male persons! [Actually, I'm curious to know whether the size of the instantiation generated by Aset-Prolog remains small in this case.] This is the reason why we discourage these encodings, and have designed a new technique, allowing to avoid the addition of this sort of "domain predicate" for N, causing the exponential blow up of the size of the ground instantiation. Unfortunately, the new technique has a strong impact on all modules of DLV, and its implementation will take some time. But we do believe that it will be a relevant enhancement.