Are We Doing the Right Thing?

Date: Thu, 7 Oct 2004
From: Vladimir Lifschitz
To: TAG

I'm beginning to think that something is fundamentally wrong with our
knowledge representation formalisms designed to describe properties of
actions.  These formalisms are supposed to provide formal notation into
which we can translate English declarative sentences having to do with
action domains.  It seems to me that these formalisms don't really allow
us to do that.  They may be interesting and valuable, but they don't
accomplish what they are supposed to.

We have known examples of translating declarative sentences into formal
notation since high school, where we used algebra to solve "word
problems."  Here is an example (Word Problems for Kids, Grade Eight,
No. 39, from http://www.stfx.ca/special/mathproblems/grade8.html).

Problem:

    The length of a rectangle is four times as long as its width. The
    area of the rectangle is 100 metres squared. What are the dimensions
    of the rectangle?

Solution:

    x  = width
    4x = length

  x*4x = 100 meters squared
  4x*x = 100
   x*x = 25
     x = 5 = width
    4x = 20 = length

Each of the two declarative sentences in the statement of the problem
corresponds to a line in the solution:

    The length of a rectangle is              4x = length
    four times as long as its width.

    The area of the rectangle is            x*4x = 100 meters squared
    100 metres squared.

Every other line in the solution is either an explanation of notation
(x = width) or a consequence of what has been proved earlier.  This is
an example of successful knowledge representation.

Let's look now at what happens when we represent properties of actions.
In the classical 1969 paper "Application of theorem proving to problem
solving" Cordell Green describes the Monkey and Bananas domain in
English:

    The monkey is faced with the problem of getting a bunch of bananas
    hanging from the ceiling just beyond his reach.  To solve the
    problem, the monkey must push a box to an empty place under the
    bananas, climb on top of the box, and then reach them.

Then he axiomatizes this domain in the situation calculus.  If we look
at his axioms MB1-MB8, we'll see only one that is a genuine translation
of a part of the English description: "an empty place under the bananas"
turns into

     MB3. (forall x) -AT(x,under-bananas,s0).

But nothing in Green's axioms corresponds to any part of the first
English sentence: to "faced with the problem", or to "getting a bunch
of bananas", or to "hanging from the ceiling", or to "just beyond his
reach".  Instead, these axioms say various things that are not even
remotely related to the English text that we are trying to represent,
for instance:

     MB1. MOVABLE(box).

     MB4. (forall b,p1,p2,s)[[AT(b,p1,s) & MOVABLE(b) &
             (forall x) -AT(x,p2,s)] ->
             [AT(b,p2,move(monkey,b,p2,s)) &
              AT(monkey,p2,move(monkey,b,p2,s))]].

KR formalisms available today are more sophisticated in some ways
than Green's form of the situation calculus (for instance, they use
nonmonotonic logic to solve the frame problem).  But descriptions of
the Monkey and Bananas domain in these languages  (see, for example,
AIJ 153 (2004), pp. 72, 73) are not closer to English than the 1969
description.

This bothers me.  What are your thoughts about it?

==============

Date: Sun, 10 Oct 2004
From: Gregory Gelfond
To: Vladimir Lifschitz

It doesn't bother me.  Here's why:

Recall your example:

The length of a rectangle is four times as long as its width. The area of the
rectangle is 100 metres squared. What are the dimensions of the rectangle?

The problem is given in a very restricted subset of English dealing with
mathematics.  As a result, the meaning of a statement can be extracted without
relying on commonsense reasoning.  Even so, consider the solution/encoding of
the problem:

1.       x = width
2.      4x = length
3.  x * 4x = 100 meters squared
4.  4x * x = 100
5.   x * x = 25
6.       x = 5 = width
7.      4x = 20 = length

We still need to encode "hidden" information.  Specifically, that the area of
a rectangle = length * width (line 3).  When we encode our action descriptions
in A-Prolog we do something similar to this level of translation.  The primary
difference seems to be that we no longer are restricting ourselves to a limited
subset of English, and therefore the amount of hidden information we need to
make explicit is much greater.

Consider our recent TAG problem of fighting for a snack:

I have an apple, a banana and a carrot.  My neighbor on the left fights with me
for the apple; maybe he will succeed in taking it away from me, maybe he won't.
After that, my neighbor on the right fights with me for the banana.  After these
two events, who are the possible owners of the apple, the banana and the carrot?

This problem has considerably more hidden information in it than the previous
one, and yet the solution is somewhat similar.

First we introduce our types.  This seems analogous to lines 1 and 2 of the
algebraic solution.  Already, though, there are many more types that we deal
with.  Some of these are explicitly stated in the problem description: people
and snacks.  The others are not stated directly but implied: time, fluents,
and later actions.

% type definitions

time(0..2).
person(me; left; right).
snack(apple; banana; carrot).
fluent(has(P,S)) :- person(P), snack(S).

The domain history has an intuitive translation, analogous to the algebraic
solution:

% domain history

h(has(me,apple), 0).
h(has(me,banana), 0).
h(has(me,carrot), 0).

o(fight(left,me,apple), 0).
o(fight(right,me,banana), 1).

Now what must we do?  Unlike the algebraic solution, we must introduce a great
deal of additional information.  For example:

% if two people fight for a snack, then one of them has possession of it

h(has(P1,S), T + 1) | h(has(P2,S), T + 1) :- o(fight(P1,P2,S), T).

There is another possible translation of the effects of fighting, caused by
conflicting commonsense views.

h(has(neither,S), T + 1) |
h(has(P1,S), T + 1)      |
h(has(P2,S), T + 1)      :- o(fight(P1,P2,S), T).

When fighting for a prize, is it the case that one party gets it in the end?

This difficulty comes from the varied interpretations of natural language
statements.  In your example, no such difficulty exists.

Why?  My view is that the problem descriptions we deal with are really
incomplete.  They rely on a great deal of information that is hidden
in the meaning of the English sentences, and in a commonsense body of knowledge
that accompanies it.  As a result, we must not only translate the problem, but
also fill in the missing parts of the description and resolve ambiguities.

Also, if we consider problems such as graph coloring, the method for
solving such problems is identical to that of solving the algebraic problem.

===========

Date: Mon, 11 Oct 2004
From: Vladimir Lifschitz
To: Gregory Gelfond

> We still need to encode "hidden" information.  Specifically, that the
> area of a rectangle = length * width (line 3).

Good point.  The solution quoted in my message can be improved by showing
explicitly the "hidden information," or "background knowledge," that it's
based on.  Something like this:

---------------------------------------------------------------------------

Abbreviations:

    x = width
    y = length
    a = area

Given facts:

    y = 4x     (the length is four times the width)
    a = 100    (the area is 100)

Background knowledge:

    a=xy

---------------------------------------------------------------------------

Having done that, we can use algebra to calculate the width and the length.
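
The same division can be written down as a small program.  What follows
is only a minimal sketch in Python, a language not used anywhere in this
discussion; the names RATIO, AREA, area and solve are hypothetical and
chosen just for this illustration.  The given facts are kept as data, the
background knowledge is a separate function, and the calculation uses
nothing beyond these two parts.

```python
import math

# Given facts, translated from the English statement of the problem.
RATIO = 4    # y = 4x   (the length is four times the width)
AREA = 100   # a = 100  (the area is 100)


def area(x, y):
    """Background knowledge, not stated in the problem: a = xy."""
    return x * y


def solve(ratio, a):
    """Substituting y = ratio * x into a = xy gives ratio * x^2 = a."""
    x = math.sqrt(a / ratio)
    y = ratio * x
    return x, y


width, length = solve(RATIO, AREA)
print(width, length)
```

Changing RATIO or AREA gives a different word problem over the same
background knowledge, which is the point of keeping the two parts apart.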

> When we encode our action descriptions in A-Prolog we do something
> similar to this level of translation.  The primary difference seems
> to be that we no longer are restricting ourselves to a limited
> subset of English,

I would say: we restrict ourselves to a less limited subset of English.

> and therefore the amount of hidden information we need to
> make explicit is much greater.

Right.  I have a question then: in formalizations of action domains, can
we identify this background knowledge, as you did in the example with the
rectangle, so that our formalization will be cleanly divided into two
parts--a body of background knowledge and a translation of the given
English sentences?

We can try to achieve that goal simply by partitioning an existing
formalization into a "background part" and a "translation part."  But it
seems to me that in many cases this will not be possible.   For instance,
in the solution to the snack problem that you posted last October we read:

> %% If two people fight for a piece of fruit, then one of them ends
> %% up having possession of it.
> h(has(Person1, Fruit), T + 1) | h(has(Person2, Fruit), T + 1) 
>                              :- o(fightFor(Person1, Person2, Fruit), T).
>
> %% Two people cannot fight for a piece of fruit if neither of them has
> %% it to begin with.
> :- o(fightFor(Person1, Person2, Fruit), T), -h(has(Person1, Fruit), T),
>   -h(has(Person2, Fruit), T).

We do have some background knowledge about fighting, and also background
knowledge about fruits.  But you'll agree that "fighting for a piece of
fruit" is an extremely specialized topic, so special that we can't claim
to have anything on this peculiar subject in our background.  Your two rules
look like special cases or immediate consequences of something much more
general.  Can you identify those more general facts?

In the case of the monkey and bananas domain, we are in bigger trouble,
I think.  One of the postulates found in the usual formalizations of that
domain says that

     if the monkey is on the box under the bananas then              (*)
     he can grasp the bananas.

This is certainly not part of our background knowledge! We all know a
little bit about boxes, and also a little bit about bananas, but not
anything like (*).  Neither is (*) a translation of anything from the
English description of the domain that I quoted in the previous message.
I'd say (*) looks like a consequence of some piece of background knowledge
with the fact that the word "just" is used in the phrase "hanging from the
ceiling just beyond his reach" in the statement of the problem.  This word
implies apparently that if the monkey stands on an object of a nontrivial
height then he will be able to reach the bananas.  Can we make this idea
precise?

Your reply helped me identify WHAT is wrong with our work on formalizing
knowledge about actions.  We do not try to  structure it as you propose:
background + translation.  A postulate can be counted as part of background
if it is of general nature, if it's not oriented specifically towards the
problem under consideration.  On the other hand, a postulate can be 
counted as part of the translation if it corresponds to a piece of the
English-language description of the problem designed for humans.  What if
we try to present our formalizations in the "background + translation"
format?

===============

Date: Mon, 11 Oct 2004
From: Michael Gelfond
To: Vladimir Lifschitz

I agree with Greg, which explains my mild dislike for some of the
examples we discussed so far. Many of them are interesting
computationally but not from the standpoint of KR.

Some answers to Vladimir's questions:

Vlad:
I have a question then: in formalizations of action domains, can
we identify this background knowledge, as you did in the example with the
rectangle, so that our formalization will be cleanly divided into two
parts--a body of background knowledge and a translation of the given
English sentences?

Michael:
Very often we can.  Here are two examples:

Example 1.
If you look at our work on the USA-Advisor, you will see that
the main body of commonsense knowledge is given by a number of
modules which constitute a Simple Theory of Electro-Mechanical Systems.

A particular system, say RCS, is obtained by translation
from diagrams. A collection of faults and a goal can be translated
from English.


Example 2.
Recently Chitta Baral, Richard Scherl, and I got involved
in the use of ASP for answering questions related to
natural language understanding.

See
Chitta Baral, Michael Gelfond, and Richard Scherl.
Using answer set programming to answer complex queries.
In Workshop on Pragmatics of Question Answering at HLT-NAACL 2004
  (Human Language Technology Conference / Annual Meeting of the North
American Chapter of the Association for Computational Linguistics), May 2004.
http://www.krlab.cs.ttu.edu///Papers/

In the next message 

[ see http://www.cs.utexas.edu/tag/travelmodule ]

I'll attach a file containing a travel module (actually it may contain
some other info which comes from other modules, e.g.
people names, months, etc, but not much of it.)

The message also includes scenarios each containing a simple story,
a question, and their translations.

As you will see, the translations of the scenarios are rather simple.

The formalization is primarily used for discussions and not
intended to be a "final" product. Any comments, suggestions, etc.
will be greatly appreciated.


Vlad:
We can try to achieve that goal simply by partitioning an existing
formalization into a "background part" and a "translation part."  But it
seems to me that in many cases this will not be possible.   For instance,
in the solution to the snack problem that you posted last October we read:

%% If two people fight for a piece of fruit, then one of them ends
%% up having possession of it.
h(has(Person1, Fruit), T + 1) | h(has(Person2, Fruit), T + 1)
                              :- o(fightFor(Person1, Person2, Fruit), T).

%% Two people cannot fight for a piece of fruit if neither of them has
%% it to begin with.
:- o(fightFor(Person1, Person2, Fruit), T), -h(has(Person1, Fruit), T),
   -h(has(Person2, Fruit), T).

We do have some background knowledge about fighting, and also background
knowledge about fruits.  But you'll agree that "fighting for a piece of
fruit" is an extremely specialized topic, so special that we can't claim
to have anything on this peculiar subject in our background.  Your two rules
look like special cases or immediate consequences of something much more
general.  Can you identify those more general facts?


Michael:
Greg told me he'll do that.


Vladimir:
In the case of the monkey and bananas domain, we are in bigger trouble,
I think.  One of the postulates found in the usual formalizations of that
domain says that

      if the monkey is on the box under the bananas then              (*)
      he can grasp the bananas.

This is certainly not part of our background knowledge! We all know a
little bit about boxes, and also a little bit about bananas, but not
anything like (*).  Neither is (*) a translation of anything from the
English description of the domain that I quoted in the previous message.
I'd say (*) looks like a consequence of some piece of background knowledge
with the fact that the word "just" is used in the phrase "hanging from the
ceiling just beyond his reach" in the statement of the problem.  This word
implies apparently that if the monkey stands on an object of a nontrivial
height then he will be able to reach the bananas.  Can we make this idea
precise?


Michael:
We probably can. I am not sure that we have to though.
To deal with this problem it may be sufficient to have a suitable
notion of a location and of a "closeness" of two locations
(possibly w.r.t an agent which I ignore).

A general axiom from the background knowledge of grasping
may have a form:

executable(grasp(A,O)) if close(loc(A),loc(O)).

where A is an agent and O is an object.

The English translation of (*) can be simply:

close(loc(monkey),loc(banana)) if under(loc(box),loc(banana)),
                                   loc(monkey) = top(box).


Of course a fairly sophisticated and general
background theory of locations and movements is needed to support
the rest of the reasoning. It may have actions like
move O1 under O2, etc.
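
The split between the general grasping axiom and the domain-specific rule
can be pictured in a few lines of Python.  This is a rough sketch under
strong assumptions: the state is a pair of hypothetical dictionaries
("under" and "on_top_of"), there is no theory of locations behind them,
and the function names are invented for the illustration.

```python
def close(state, agent, obj):
    """Domain-specific part: the counterpart of the rule for (*)."""
    return (
        agent == "monkey"
        and obj == "banana"
        and state["under"].get("box") == "banana"      # box is under the banana
        and state["on_top_of"].get("monkey") == "box"  # monkey is on the box
    )


def executable_grasp(state, agent, obj):
    """Background part: grasp(A,O) is executable if A is close to O."""
    return close(state, agent, obj)


# Two hypothetical states: the box is already under the banana in both.
ON_FLOOR = {"under": {"box": "banana"}, "on_top_of": {}}
ON_BOX = {"under": {"box": "banana"}, "on_top_of": {"monkey": "box"}}

print(executable_grasp(ON_FLOOR, "monkey", "banana"))
print(executable_grasp(ON_BOX, "monkey", "banana"))
```

Only close() mentions the monkey, the box and the banana; executable_grasp()
would stay the same in any other domain that supplies its own notion of
closeness.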

Vladimir:

Your reply helped me identify WHAT is wrong with our work on formalizing
knowledge about actions.  We do not try to  structure it as you propose:
background + translation.  A postulate can be counted as part of background
if it is of general nature, if it's not oriented specifically towards the
problem under consideration.  On the other hand, a postulate can be
counted as part of the translation if it corresponds to a piece of the
English-language description of the problem designed for humans.  What if
we try to present our formalizations in the "background + translation"
format?

Michael:
I agree.  My suggestion for the TAG members is to actually concentrate
on building what many people call "micro-theories" and on selecting examples
to test them. Going from examples to micro-theories seems to be
less interesting methodologically.

===========

Date: Mon, 11 Oct 2004
From: Gregory Gelfond
To: Vladimir Lifschitz

> We do have some background knowledge about fighting, and also background
> knowledge about fruits.  But you'll agree that "fighting for a piece of
> fruit" is an extremely specialized topic, so special that we can't claim
> to have anything on this peculiar subject in our background.  Your two rules
> look like special cases or immediate consequences of something much more
> general.  Can you identify those more general facts?

I believe we can.  What do you think of the following partitioning of  
my solution to the "fight for a snack" problem?:

%======================================================================= 
% direct translation
%======================================================================= 

h(has(me,apple),0).
h(has(me,banana),0).
h(has(me,carrot),0).

o(fightFor(left,me,apple), 0).
o(fightFor(right,me,banana), 1).

%======================================================================= 
% type information
%======================================================================= 

snack(apple).
snack(banana).
snack(carrot).

person(me).
person(left).
person(right).
#domain person(Person; Person1; Person2).

object(X) :- snack(X).

time(0..2).

%======================================================================= 
% theory of fighting for a prize
%======================================================================= 

#domain object(Object).
#domain time(T).

h(has(Person1,Object), T + 1) | h(has(Person2,Object), T + 1) :-
	o(fightFor(Person1,Person2,Object), T),
	not ab(Object, T + 1).

:- o(fightFor(Person1,Person2,Object), T),
	-h(has(Person1,Object), T),
	-h(has(Person2,Object), T).

-h(has(Person1,Object), T) :-
	h(has(Person2,Object), T),
	neq(Person1,Person2).

%======================================================================= 
% theory of ownership
%======================================================================= 

inertialFluent(has(Person,Object)).

%======================================================================= 
% general inertia axiom
%======================================================================= 

#domain inertialFluent(InertialFluent).

h(InertialFluent, T + 1) :-
	h(InertialFluent, T),
	not -h(InertialFluent, T + 1).

-h(InertialFluent, T + 1) :-
	-h(InertialFluent, T),
	not h(InertialFluent, T + 1).

%======================================================================= 
% smodels directives
%======================================================================= 

has(Person,Object,T) :- h(has(Person,Object),T).
hide.
show has(X,Y,Z).
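
For readers without smodels at hand, the intended reading of this program
can be cross-checked by brute force.  The Python sketch below is not a
translation of the A-Prolog rules and makes no claim about what smodels
prints; it assumes the first reading of fighting (one of the two fighters
ends up with the object) and models inertia by copying the previous state.
All names in it are hypothetical.

```python
# Direct translation of the story: initial owners and the two fights.
INITIAL = {"apple": "me", "banana": "me", "carrot": "me"}
FIGHTS = [("left", "me", "apple"), ("right", "me", "banana")]


def outcomes(owner, fight):
    """Successor states of one fight: one of the two fighters gets the object."""
    p1, p2, snack = fight
    # The fight cannot occur if neither fighter has the object to begin with.
    if owner[snack] not in (p1, p2):
        return []
    # Everything not fought for keeps its owner (inertia).
    return [{**owner, snack: winner} for winner in (p1, p2)]


def possible_worlds(initial, fights):
    """All final states reachable by resolving the fights in order."""
    worlds = [dict(initial)]
    for fight in fights:
        worlds = [nxt for world in worlds for nxt in outcomes(world, fight)]
    return worlds


def possible_owners(worlds):
    """For each object, the people who own it in at least one final state."""
    return {snack: sorted({w[snack] for w in worlds}) for snack in INITIAL}


WORLDS = possible_worlds(INITIAL, FIGHTS)
OWNERS = possible_owners(WORLDS)
print(len(WORLDS), OWNERS)
```

Under these assumptions there are four final states: the apple belongs to
me or to my left neighbor, the banana to me or to my right neighbor, and
the carrot stays with me.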

===========

Date: Mon, 11 Oct 2004
From: Vladimir Lifschitz
To: Gregory Gelfond

I like your "theory of fighting for a prize" and "theory of ownership."
That's exactly the kind of background knowledge we talked about.

===========

Date: Mon, 11 Oct 2004
From: Vladimir Lifschitz
To: Michael Gelfond

Please clarify the place of the expression

   close(loc(monkey),loc(banana)) if under(loc(box),loc(banana)),
                                     loc(monkey) = top(box)

in your analysis of the monkey and bananas example.  It seems to me that
this is the kind of axiom that we'd like to avoid if we follow the
"background + translation" methodology.  This axiom is much too special
for the status of background knowledge, and this doesn't look like the
translation of any part of the English description of the domain.  Do
you agree?

===========

Date: Tue, 12 Oct 2004
From: Michael Gelfond
To: Vladimir Lifschitz

No, I don't, but I am probably missing something.
In my previous message I say that the above rule is a translation of

(*)    if the monkey is on the box under the bananas then
        he can grasp the bananas.

Are you objecting to the fact that the translation is not literal?

Would you prefer

> can_reach(monkey,banana) if under(loc(box),loc(banana)),
>                             loc(monkey) = top(box)

with the background theory containing something like:

:- o(grasp(A,O),T),
    not holds(can_reach(A,O),T)   ?

In the setting of a particular example we have a lot of freedom with selecting
our basic relations. If however we already have a theory relevant to the example
then part of this freedom is gone - the translation will depend on the existing
"target" language. This is why I had "close" - I used it before to describe
"neighboring" or "close proximity" locations (e.g. robot needs to move close
to the door to open it). So "close" can be viewed as given - belonging to the
target language.

I am not sure if it helps to clarify my previous message or makes it even
less clear.  Please let me know.

I am very happy though that we are having this discussion. I really
believe that finding the right methodology of formalization (and tentatively
agreeing on it), including building micro-theories, learning how to 
expand them and how to combine them in larger modules is one of the most
interesting challenges we are facing now.

So I hope the discussion will continue.

===========

Date: Tue, 12 Oct 2004
From: Vladimir Lifschitz
To: Michael Gelfond

> In my previous message I say that the above rule is a translation of
> 
> (*)    if the monkey is on the box under the bananas then
>         he can grasp the bananas.

Right.  My point was that (*) is something that we should NOT try to
represent in our formalizations, because the English statement of the
problem that I quoted from Cordell Green's paper doesn't include (*).
It says instead that

  (**) a bunch of bananas hangs from the ceiling just beyond
        the monkey's reach.

My complaint was that each of the usual formalizations of the Monkey and
Bananas domain (in the classical situation calculus, in logic programming,
in action languages) has a postulate corresponding to (*), and we have
never tried to translate (**).  Instead of translating what the English
says, we write something that allows us to solve the planning problem more
easily, and this is cheating!

===========

Date: Tue, 12 Oct 2004
From: Michael Gelfond
To: Vladimir Lifschitz

Sorry, my mistake.
I relied on my memory of your message which I read in Florida.
So I thought that (*) is part of the original story.
I'll try to be more careful next time (and save the messages).

May it be that you are addressing two problems:

(1)  Adding background knowledge to every example we are trying to
      formalize makes our formalizations too complex.
      Development of micro-theories can, at least partially,
      remedy this problem.

(2)  During the formalization we decide what part of our commonsense
      knowledge is used by us in the process of translation and what
      part is formalized explicitly in the translation and/or
      background knowledge.

Is this a reasonable interpretation?

If so I definitely missed the item (2). I need to think more about it,
but here is an immediate reaction:

          When we translate a word problem
          into an algebraic equation or a puzzle
          into some set of constraints for a
          constraint solver to solve
          we have a similar problem, right?

          In these cases our translations
          depend on our knowledge of algebra
          or constraint satisfaction algorithms.

          So it may be that our translation of the
          monkey and banana problem will depend
          on the existence of background macro-theories.
          If we have one which already deals with
          locations in three dimensional space,
          understands a notion of "border", etc.
          then we can stay closer to Green's
          text. Otherwise we prefer to ignore some
          details. The most detailed translation
          is not always the best of course but how
          to get one is still very interesting.

Does this make sense?

Other points of my message are related to problem 1
above  and are still valid.
So comments are still appreciated.

===========

Date: Tue, 12 Oct 2004
From: Vladimir Lifschitz
To: Michael Gelfond

>           When we translate a word problem
>           into an algebraic equation or a puzzle
>           into some set of constraints for a
>           constraint solver to solve
>           we have a similar problem, right?

Yes.

>           In these cases our translations
>           depend on our knowledge of algebra
>           or constraint satisfaction algorithms.

How do they depend on the knowledge of algorithms?

===========

Date: Tue, 12 Oct 2004
From: Michael Gelfond
To: Vladimir Lifschitz

They depend on knowing what input the algorithms understand.
For instance, I recently attended a talk where the speaker solved
a puzzle using CSP. The story contained information
about neighboring houses. The CSP encoding enumerated
the houses, so instead of saying neighbor(N,M) they say
|N - M| = 1. The corresponding CSP algorithm can easily
deal with the latter representation but not the former one.
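
To make the contrast concrete, here is a small Python sketch; it is not
the speaker's CSP encoding, and the number of houses (five) is an
assumption made only for the illustration.  Once the houses are numbered,
the relation neighbor(N,M) no longer has to be listed as facts, because it
can be computed arithmetically.

```python
def neighbor(n, m):
    """Arithmetic form of the relation: houses n and m are adjacent."""
    return abs(n - m) == 1


# Hypothetical row of five numbered houses.
HOUSES = range(1, 6)

# The same relation spelled out as explicit facts neighbor(N,M).
EXPLICIT = {(n, m) for n in HOUSES for m in HOUSES if neighbor(n, m)}
print(sorted(EXPLICIT))
```

The explicit set is what a relational encoding would have to state fact by
fact; the arithmetic test is the form a numeric constraint solver can use
directly.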

===========

Date: Mon, 18 Oct 2004
From: Chitta Baral
To: TAG

I am very happy that Vladimir brought up this issue of formalizing
a set of connected sentences in English (a natural language)
and answering queries about it.

Following are some of my views (which echo some of the views
of Michael and Greg).

When a human reasoner reads a paragraph and answers questions
about it, it is often the case that he or she uses additional background
knowledge. Thus by just formalizing the paragraph we may not be
able to answer the queries. We need to formalize the background
knowledge too.

In addition, if the query involves `deep reasoning' notions,
we need to formalize those too. For example, if the query asks for a plan,
then we need to formalize what a `plan' is. If the query is about
finding an `explanation' we need to formalize what an explanation is.
Similarly for a `diagnosis', a `cause', etc.

In the past we have done this but normally without separating out the
above aspects. In doing that we have put in a lot of effort in the
`deep reasoning' aspects. (Most papers in reasoning about actions,
and its applications to planning, diagnosis, etc.)

I think the time has come for us to tackle the issue of formalizing various
background knowledge. This is a humongous effort, but having such
a formalization has a big payoff. A couple of decades back CYC was
such an attempt. I think the time was not quite right then, and it was
done in a closed company environment.

There are similar attempts in related fields.
(See for example WordNet http://www.cogsci.princeton.edu/~wn/,
VerbNet, etc.)

Also, from the knowledge representation angle there is
projecthalo.com, about encoding knowledge of various scientific
fields.

I have been discussing with Michael the idea of holding a small
workshop in Tempe (in March during the weekend before or after
ASU Spring break + 1 more day) on this topic. The goal would
be to make a start on developing a repository of background knowledge
encoded in a logical language with a precise semantics and at least
a prototype implementation.  If there is interest we can also have
a workshop before or after LPNMR in Italy.

For the workshop in Tempe, I could try to get some funding from
our department, AAAI, and the funding agency that funds Michael and me.

===========

Date: Mon, 18 Oct 2004
From: John McCarthy
To: TAG

I have followed the TAG discussion and applaud Vladimir for raising the
issues.

I entirely agree with Chitta about the need to formalize background
knowledge and agree that it will require a big effort.  I'd like to
attend the workshop and present a paper.  As many of you know, I have
been working on these issues since 1958.  Many people have made
progress on relevant concepts.

I'm still improving situation calculus and now have a version in mind
for which I have great hopes.  It's an extension of the version in my
KR2002 article "Actions and other events in situation calculus".  It
has the additional predicate Concludes(s,event,s').  Possibly this
idea isn't new.  I'll put up a paper about it when I have more.

===========

Date: Mon, 18 Oct 2004
From: Vladimir Lifschitz
To: Chitta Baral

> I think the time has come for us to tackle the issue of formalizing various
> background knowledge.

Yes, this is the research topic that I'm planning now to concentrate on.
An aspect of it is discussed in the proposal that I submitted to the NSF
last year (still pending), entitled "General Purpose Database of Knowledge
about Actions".  Here is a brief summary:

  The objective of this project is to build a general purpose database of
  commonsense facts about actions expressed in the language of CCalc.  This
  language has been used to formalize many small action domains ("toy
  worlds").  On the basis of this work it's possible now to isolate the
  concepts and principles that these and similar individual domains have in
  common and to build a database of such principles stated in a general
  form.  Once such a database is created, it will be possible to obtain
  from it formalizations of many individual action domains, including those
  studied in the past, by adding domain-specific facts.

  The database will include general assumptions about the effects, both
  direct and indirect, of actions of various kinds, about the executability
  of actions, and about agents that execute them.  Some of these facts have
  to do with objects that change their locations and internal states.  One
  useful assumption, for instance, is that normally an agent can execute an
  action affecting the location or state of an object only if the agent is
  located next to the object.  According to another assumption to be
  included in the database, objects are typically supported by something--
  by horizontal surfaces of some kind, or by other objects supported by
  such a surface.  Moving an object indirectly affects the locations of the
  things that are supported by that object.  The database will also contain
  commonsense facts about creating and destroying objects, about animals
  and humans, about buying and selling, about actions related to knowledge
  and memory, and so forth.

===========

Date: Mon, 18 Oct 2004
From: Vladimir Lifschitz
To: John McCarthy

> As many of you know, I have been working on these issues since 1958.

Among the various things that we have learned from your papers, three seem
particularly relevant when we think about formalizing background knowledge.

1. The idea of generality in AI:

  "...no-one knows how to make a general database of common sense knowledge
  that could be used by any program that needed the knowledge. Along with
  other information, such a database would contain what a robot would need
  to know  about the effects of moving objects around, what a person can be
  expected to know about his family, and the facts about buying and selling.
  This doesn't depend on whether the knowledge is to be expressed in a
  logical language or in some other formalism...  In my opinion, getting a
  language for  expressing general common sense knowledge for inclusion in
  a general database is the key problem of generality in AI."

2. Nonmonotonic reasoning.  To make the database of background knowledge
truly general, we need to state every postulate as a default.  We should
always be ready for exceptions.  Take the example from my previous message:
"normally an agent can execute an action affecting the location or state of
an object only if the agent is located next to the object."  Using a remote
control is an exception, and so is sending an e-mail.  I expect an
abnormality predicate in pretty much every line of the database.

3. Contexts.  The database of background knowledge and formal descriptions
of specific domains will usually express the same idea using different
vocabularies, and we need to "bridge" them to make them work together.
What is called "location" in the general database may be called "city" in
a travel module and "bank" when we reason about missionaries and cannibals.
In simple cases, a simple string substitution may be sufficient, but
sometimes we'll need rather sophisticated translation procedures.
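
Points 2 and 3 above can be illustrated by a rough Python sketch.  All names
in it (bridge, the renaming tables, ruled_out_by_proximity_default, the
action names) are hypothetical and chosen only for illustration; none comes
from an existing database or module.  The sketch covers only the simple case
where a bridge is a renaming of predicates, and it models the abnormality
predicate as a plain set of exceptional actions.

```python
def bridge(fact, renaming):
    """Translate a domain fact into the general vocabulary.

    A fact is a (predicate, args) pair; only the predicate name is renamed.
    This is the "simple string substitution" case, nothing more.
    """
    predicate, args = fact
    return (renaming.get(predicate, predicate), args)


# Domain-specific words for what the general database calls "location"
# (hypothetical tables).
TRAVEL_TO_GENERAL = {"city": "location"}
MISSIONARIES_TO_GENERAL = {"bank": "location"}


def ruled_out_by_proximity_default(agent_at, object_at, action, abnormal):
    """Default: normally an agent can act on an object only if it is next to it.

    Returns True when the default forbids the action.  An action listed in
    `abnormal` is an exception, so the default says nothing about it.
    """
    if action in abnormal:
        return False
    return agent_at != object_at


# Exceptions mentioned in the text, under assumed action names.
ABNORMAL_ACTIONS = {"use_remote_control", "send_email"}

print(bridge(("city", ("paris",)), TRAVEL_TO_GENERAL))
print(ruled_out_by_proximity_default("home", "office", "push", ABNORMAL_ACTIONS))
print(ruled_out_by_proximity_default("home", "office", "send_email", ABNORMAL_ACTIONS))
```

The more sophisticated translation procedures mentioned in point 3 would
replace the dictionary lookup in bridge; the sketch does not attempt them.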

===========

Date: Mon, 18 Oct 2004
From: Peter Clark
To: TAG

Very interesting discussion! I agree these are fundamental issues that 
merit a lot of attention.

On the challenge of going from English sentences to a formalization, there
was some interesting work done in the 1970s on answering physics questions
expressed in English, which required addressing this mapping from English to
logic: in particular ISAAC (Gordon Novak, UT Austin) and MECHO (Alan Bundy,
Edinburgh). ISAAC is well documented at
http://www.cs.utexas.edu/users/novak/physics.html (see, e.g.,
http://www.cs.utexas.edu/users/novak/ijcai77.html).
We at Boeing are planning something similar in an ongoing project 
(Project Halo, mentioned below) in collaboration with UT Austin and SRI
International. 

On repositories of background knowledge, another interesting resource is
the Component Library (CLib) from Bruce Porter's group at UT Austin, at
http://www.cs.utexas.edu/users/mfkb/RKF/clib.html

===========

Date: Tue, 19 Oct 2004
From: Bruce Porter
To: TAG

As a starting point on building a database of formal representations
of actions (and events and roles, as well), you might be interested in
reviewing the "component library" that we've been building for several
years. Please see: 
   http://www.cs.utexas.edu/users/mfkb/RKF/clib.html

The component library contains KM representations (along with
documentation) of commonsense ...
  - actions, such as move, enter, carry, penetrate
  - states, such as be-contained, be-attached
  - roles, such as vehicle, employee
  - relations, such as case roles
  - properties, such as length, intensity

We've found that the component library can be used (even by non-AI'ers)
to build functional knowledge systems in specific domains. Recently, we
used it to help build the winning knowledge system for Project Halo,
Paul Allen's AP-Chemistry challenge; see:
   http://www.cs.utexas.edu/users/mfkb/RKF/projects/halo.html 

===========

Date: 21 Oct 2004
From: Vladimir Lifschitz
To: Bruce Porter and Peter Clark

Thank you for the useful references.  We are now trying to develop a
completely abstract description of the move action in the action language
C+ and to find a way to describe "bridges" between that description and
properties of moving objects in specific domains.  Even a brief look at
http://www.cs.utexas.edu/users/mfkb/RKF/trunktree/km/Move.km.html was
useful to us: it suggested a better choice of terminology than what
we originally came up with.

On a deeper level, I'm wondering whether KM representations can be used
to answer queries of the kind Michael deals with in his Travel Module
(http://www.cs.utexas.edu/tag/travelmodule).  If not, is it because of
some difference in the goals of the two projects?

===========

Date: Wed, 3 Nov 2004
From: Vladimir Lifschitz
To: Michael Gelfond

We (TAG at Austin) started discussing your draft on the commonsense theory
of travel (http://www.cs.utexas.edu/tag/travelmodule) from the perspective
of our earlier exchange of messages on the problem of formalizing background
knowledge (http://www.cs.utexas.edu/tag/rightthing).  We have only covered
a small part of your draft so far, but we already have a few questions and
comments.  Here is one.

You propose this postulate about cities, countries, and unions (such as EU):

in(C,Union) :- in(C,Country),
	       in(Country,Union).

It seems to me that this rule cannot really be counted as a formalization
of our background knowledge, because it's much too specialized.  What we
do have in our background is the transitivity of the containment relation
between locations.  An attempt to formalize this (in the language of CCalc)
was made in our paper "Getting to the Airport: the oldest planning problem
in AI":

:- variables
  X,Y,Z                    :: object;
   
...

:- constants
  at(object,object)        :: inertialFluent;

...

caused  at(X,Z) if at(X,Y) & at(Y,Z).

(see http://www.cs.utexas.edu/users/tag/cc/examples/airport/airport-domain).

Note that we treated at(X,Y) as a fluent, which makes the postulate more
general.  If you do this, you may be able to treat in(C,Country) and the
fluent inside(B,Container) that you introduce later as instances of the
same relation.  What do you think?

Treating in(Paris,France) as a fluent that happens not to change its value
no matter what actions we perform may seem strange.  But this doesn't
hurt.  I remember that many years ago I saw John McCarthy's draft in which
he asked whether at(Stanford,California) should be viewed as a fluent.  He
said: what if the Board of Regents decides to move Stanford to another
place?

We also wanted to ask you about this part of your draft:

> %%%%%% Default Initial Values for Some Fluents %%%%%%%
> 
> % We can assume that in the beginning of the story
> % the traveler already has his passport, 
> % and his luggage.
> 
> h(has(P,passport(P)),0) :-
>       not -h(has(P,passport(P)),0).
> 
> h(has(P,Luggage),0) :-
>         owns(P,Luggage),
>         not -h(has(P,Luggage),0).      
> 
> % If in the initial situation P goes on journey J
> % then, in the absence of contrary evidence, we  
> % assume that  J and P are at the origin of J.
> % o(A,T) says that "action A occurs at time step T".
> 
> h(at(J,C),0) :-
>       o(go_on(P,J),0),
>       origin(J,C),
>       not -h(at(J,C),0).
> 
> h(at(P,C),0) :-
>       o(go_on(P,J),0),
>       origin(J,C),
>       not -h(at(P,C),0).

I am not sure what the status of these rules is.  They don't look like
parts of our background knowledge about the world, or do they?  They are
assumptions about the kind of states that we are willing to denote by 0
in specific scenarios.  It's not clear where they belong in our scheme of
KR as "background + translation."

===========

Date: Fri, 5 Nov 2004
From: Michael Gelfond
To: Vladimir Lifschitz

Below are Vladimir's comments and my answers.

VLADIMIR:
You propose this postulate about cities, countries, and unions (such as
EU):

in(C,Union) :- in(C,Country),
	       in(Country,Union).

It seems to me that this rule cannot really be counted as a
formalization of our background knowledge, because it's much too
specialized.  What we do have in our background is the transitivity
of the containment relation between locations.
An attempt to formalize this (in the language of
CCalc) was made in our paper "Getting to the Airport: the oldest
planning problem in AI":

:- variables
    X,Y,Z                    :: object;

...

:- constants
    at(object,object)        :: inertialFluent;

...

caused  at(X,Z) if at(X,Y) & at(Y,Z).

(see http://www.cs.utexas.edu/users/tag/cc/examples/airport/airport-domain).

Note that we treated at(X,Y) as a fluent, which makes the postulate more
general.  If you do this, you may be able to treat in(C,Country) and the
fluent inside(B,Container) that you introduce later as instances of the
same relation.  What do you think?

MICHAEL:
I agree that the definition of the relation "in", including
the UNION axiom you mentioned above, is not good.
It was not intended to be. These are just little pieces used
to test the travel module. (Initially we only had a few atoms.
The union axiom was added later to test exceptions to some
defaults about the need for passports.)

In general this information should come from another
module which we tentatively call Geography.
Intuitively this module (among other things) contains a hierarchy
of "places" or "regions". The hierarchy is an acyclic directed
graph. It can be represented, say, by atoms

link(paris,france).
link(france,western_europe).
link(france,eu).
link(western_europe,europe).

The "in" relation is simply the transitive closure of "link".

in(Place1,Place2) :-
                link(Place1,Place2).

in(Place1,Place3) :-
                link(Place1,Place2),
                in(Place2,Place3).

This seems like a more reasonable piece of background geographical
knowledge than the original one.
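
As a rough check of the idea, here is a minimal Python sketch that computes
"in" as the transitive closure of "link" over the four atoms above, so that
every link is itself an instance of "in".  The function name
transitive_closure and the set-of-pairs representation are assumptions made
for illustration; this is not how an answer set solver evaluates the rules.

```python
def transitive_closure(links):
    """Return the transitive closure of a set of (smaller, larger) pairs."""
    closure = set(links)
    while True:
        # Compose every known pair (a, b) with every known pair (b, d).
        new_pairs = {(a, d)
                     for (a, b) in closure
                     for (c, d) in closure
                     if b == c}
        if new_pairs <= closure:   # fixpoint reached: nothing new to add
            return closure
        closure |= new_pairs


# The link atoms from the message, as pairs.
LINKS = {
    ("paris", "france"),
    ("france", "western_europe"),
    ("france", "eu"),
    ("western_europe", "europe"),
}

IN = transitive_closure(LINKS)
print(("paris", "eu") in IN)
print(("paris", "europe") in IN)
```

Because the hierarchy is acyclic, the loop terminates without ever producing
a pair of the form (x, x).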

You seem to be looking for a substantially more general
notion of "in" which you call "at". Right?

I am not sure if it subsumes our "at" or not.
Intuitively you define at(L1,L2) as a "containment relation
between locations". So our at(person,location) seems to be
different from yours. On the other hand, in the declaration above
the parameters of your "at" are objects. So I am not sure.

In general it is very interesting to see if such a generalization
will be useful for our task. I do not know the answer to
this question yet. Any suggestion on how we can check that?

VLADIMIR:
Treating in(Paris,France) as a fluent that happens not to change its
value no matter what actions we perform may seem strange.
But this doesn't hurt.  I remember that many years ago I saw
John McCarthy's draft in which he asked whether at(Stanford,California)
should be viewed as a fluent.  He said: what if the Board of Regents
decides to move Stanford to another place?


MICHAEL:
I agree that it does not hurt (and actually helps) if your
goal is to write really general axioms. This is a very
interesting goal.
But we are using our travel module as part of a program
which is supposed to answer our questions in a reasonable
amount of time. I am pretty sure that making "in"
a fluent will increase this time.
Deciding what theoretical possibilities shall or shall not
be included in your module is task dependent.
A knowledge engineer should make these decisions based on
the use of his system.


VLADIMIR:
We also wanted to ask you about this part of your draft:

 > %%%%%% Default Initial Values for Some Fluents %%%%%%%
 >
 > % We can assume that in the beginning of the story
 > % the traveler already has his passport,
 > % and his luggage.
 >
 > h(has(P,passport(P)),0) :-
 >       not -h(has(P,passport(P)),0).
 >
 > h(has(P,Luggage),0) :-
 >         owns(P,Luggage),
 >         not -h(has(P,Luggage),0).
 >
 > % If in the initial situation P goes on journey J
 > % then, in the absence of contrary evidence, we
 > % assume that  J and P are at the origin of J.
 > % o(A,T) says that "action A occurs at time step T".
 >
 > h(at(J,C),0) :-
 >       o(go_on(P,J),0),
 >       origin(J,C),
 >       not -h(at(J,C),0).
 >
 > h(at(P,C),0) :-
 >       o(go_on(P,J),0),
 >       origin(J,C),
 >       not -h(at(P,C),0).

I am not sure what the status of these rules is.  They don't look like
parts of our background knowledge about the world, or do they?  They are
assumptions about the kind of states that we are willing to denote by 0
in specific scenarios.  It's not clear where they belong in our scheme
of KR as "background + translation."


MICHAEL:
You are right.
They do not really belong to the TRAVEL MODULE proper (very much
like the "geography" does not).

TRAVEL MODULE describes possible trajectories
of the domain. In terms of action languages it is an action
description (signature plus the laws).

It is normally used in conjunction with a history: a
collection of observations of occurrences of actions and of
truth values of fluents. (Our use of the term is somewhat
different from that used in Austin.)

Part of this history is supplied by the content of our stories
(translation).
But another part consists of some communication agreements
which help us to understand the stories. The above defaults belong
to this category.  As you say, "they are assumptions about the
kind of states that we are willing to denote by 0" in our translations.
They are less general than those used in the travel module proper.
If someone decides to use this module for, say, planning
a trip, he may describe the initial situation in
a different way.

It is still background knowledge + translation though,
just background knowledge of a different type.

===========

Date: Fri, 5 Nov 2004
From: Vladimir Lifschitz
To: Michael Gelfond

The paper on the airport example is available at

  http://www.cs.utexas.edu/users/vl/papers/airport.ps

and Sec. 2.1, entitled "Transitivity of 'at'," is relevant to this
discussion.

> I am not sure if it subsumes our "at" or not.
> Intuitively you define at(L1,L2) as a "containment relation
> between locations". So our at(person,location) seems to be
> different from yours. On the other hand, in the declaration above
> the parameters of your "at" are objects. So I am not sure.

As far as I remember, that paper (as well as its predecessor, McCarthy's
paper "Programs with Common Sense") doesn't distinguish between an object
and the part of space that the object occupies.  We can think of a
suitcase (or a building, or a city, or a continent) both as an object and
as a part of space.

===========

Date: Mon, 8 Nov 2004
From: Vladimir Lifschitz
To: Michael Gelfond

> if your goal is to write really general axioms. This is a very
> interesting goal.

Yes, I am interested in describing commonsense knowledge by axioms that are
maximally general.  This is similar to Bourbaki's approach to describing
mathematical knowledge: every fact should be stated in the maximally
abstract form, with all irrelevant details dropped.  For instance, the
binomial theorem should be stated as a theorem about commutative rings,
because the only properties of numbers you need to know to prove that
theorem are the axioms for commutative rings.  The fact that you can divide
numbers, or compare them, or compute limits of numerical sequences--all that
is irrelevant as far as the binomial theorem goes.  My suggestion that you
modify the rule

    in(C,Union) :- in(C,Country), in(Country,Union)

is related to the fact that it's not "Bourbaki-style."
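
For reference, the binomial theorem in the generality mentioned above can be
stated as follows.  This is the standard formulation; R is assumed here to be
a commutative ring with identity, and an integer multiple of a ring element
is understood as repeated addition.

```latex
\[
  (a+b)^n \;=\; \sum_{k=0}^{n} \binom{n}{k}\, a^{k}\, b^{\,n-k}
  \qquad \text{for all } a, b \in R \text{ and all integers } n \ge 0 .
\]
```

Nothing in the statement refers to division, order, or limits.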

> But we are using our travel module as part of a program
> which is supposed to answer our questions in a reasonable
> amount of time. I am pretty sure that making "in"
> a fluent will increase this time.

A few comments:

1. I agree about the computation time.  On the other hand, computation time
is not the only measure of the quality of a formalization.  We can also
ask how economical and elegant the axiom set is, and here generality may be
our ally.  Consider, for instance, the Monkey and Bananas domain, which
involves four actions: Walk, PushBox, Climb and GraspBananas.  We may be able
to treat the first three as instances of the same action--moving an object.
In PushBox, the monkey moves the box; in Walk and Climb, he moves himself.
In Walk and PushBox, motion is horizontal; in Climb, it's vertical.  But
otherwise we are talking about the same action, in the abstract sense. Isn't
this interesting?

2. I remember the old days when formal commonsense reasoning and automated
reasoning were two almost disjoint research areas, because no one knew how
to automate anything nonmonotonic.  There was no ASP.  There was Prolog, of
course, but we didn't know much about its relation to formal theories of
nonmonotonic reasoning. We are much happier now, because we can experiment
with answer set solvers.  Moreover, ASP has real applications today!  But
we pay a price for all this: we are tempted to think in terms of computation
time, instead of mathematical elegance.  The Good Book says: where your
treasure is, there will your heart be also.

3. We may be able to reduce computation time without losing generality if
we make our query evaluation procedures more intelligent.  The first step
towards that goal will be to design a maximally general formalization that
treats "in" as a fluent. After that, we'll ask ourselves: how can an
evaluator exploit the fact that in some cases this fluent is rigid (that
is to say, doesn't change its values with time) to decrease the computation
time?
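
The suggestion in comment 1, that Walk, PushBox and Climb are instances of
one abstract action, can be sketched in Python as follows.  The Move record
and its fields (agent, moved, direction) are a hypothetical representation
invented for this illustration; it is not the C+ description under
development, and it says nothing about preconditions or effects.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Move:
    """An abstract "move an object" action (hypothetical representation)."""
    agent: str       # who performs the action
    moved: str       # the object whose location changes
    direction: str   # "horizontal" or "vertical"


def walk(agent):
    # The agent moves himself horizontally.
    return Move(agent=agent, moved=agent, direction="horizontal")


def push_box(agent, box):
    # The agent moves the box horizontally.
    return Move(agent=agent, moved=box, direction="horizontal")


def climb(agent):
    # The agent moves himself vertically.
    return Move(agent=agent, moved=agent, direction="vertical")


def moves_self(action):
    """True when the agent and the moved object coincide."""
    return action.agent == action.moved


for action in (walk("monkey"), push_box("monkey", "box"), climb("monkey")):
    print(action, moves_self(action))
```

GraspBananas is left out on purpose: the text treats only the first three
actions as candidates for this generalization.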