Medium Size ASP Programs

Date: Wed, 23 Nov 2005
From: Claudia Zepeda
To: TAG

I am interested in testing some naive answer set planning encodings using a
background knowledge of "medium size" (related with a real application).
However, sometimes memory errors occur when I run DLV or Smodels to obtain
the answer sets for these encodings. Additionally, sometimes it takes a long
time to obtain the answer sets. I am conscious that the execution of large
programs using answer set solvers is one of the challenges faced today.
However, according to your experience using these solvers, please, could you
tell me if you know some optimisation techniques or methods that help to
reduce in part the time and/or the memory requirements for DLV or Smoldels
to obtain the answer sets? I found that one of the ways to improve the
performance of the answer sets solvers could be the use of "parallelism",
do you know if there is some prototype of this kind of systems available? 

With the only intention to illustrate one of the problems that I am
mentioning, I present one of the naive encodings for one of the problems
that I am dealing: 

Let us consider a planning problem where given a graph, a subset of nodes
corresponds to the initial positions for a set of mobiles and a subset of
nodes corresponds to the final positions of these mobiles. The only one
action is "travel" that allows a mobile travel by an arc. The action travel
takes one unit of time. Each arc has a capacity indicating the number of
mobiles that can travel by this arc per unit of time. The action travel can
be concurrent, i.e., different mobiles can travel by the arcs at the same
unit of time. Then the plans indicate the sequence of arcs that each mobile
should travel in order to arrive to its final position. I include the naive
answer set encoding (in Smodels) at the end of this message. I started
testing this encoding with 5, 6, 7, 8, 9 and 10 mobiles where all of them
have the same initial and final (goal) position and I get a set of possible
plans. However, when I added one more mobile, i.e., 11 mobiles at the same
initial position that have to arrive at the same final position, Smodels
indicates a memory problem. Specifically, the problem is:


smodels version 2.26. Reading...couldn't allocate necessary memory
head atom out of bounds, line 3793447
done
Error in input

I am using a Computer intel Pentium 4 CPU 2.4GHz, 480MB of RAM with system
Microsoft Windows XP, Professional Version 2002. 

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

% lparse mobiles.sm | smodels 0

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

const length=16.
time(1..length).
node(1..6).
capacity(1..3).
 
mobile(b1). mobile(b2). mobile(b3). mobile(b4). mobile(b5).
mobile(b6). mobile(b7). mobile(b8). mobile(b9). mobile(b10).
mobile(b11).

%% arc(position,position,capacity)
arc(1,2,3). arc(2,3,2). arc(2,3,1). arc(2,4,1). 
arc(3,5,2). arc(4,5,1). arc(5,6,1).

%%Initial position of each mobile (at time 1) 
%%holds(at(mobil,position), time).
holds(at(b1,1),1).
holds(at(b2,1),1).
holds(at(b3,1),1).
holds(at(b4,1),1).
holds(at(b5,1),1).
holds(at(b6,1),1).
holds(at(b7,1),1).
holds(at(b8,1),1).
holds(at(b9,1),1).
holds(at(b10,1),1).
holds(at(b11,1),1). 
 
%%goal
goal:- holds(at(b1,6),T),
holds(at(b2,6),T), 
holds(at(b3,6),T), 
holds(at(b4,6),T), 
holds(at(b5,6),T), 
holds(at(b6,6),T), 
holds(at(b7,6),T), 
holds(at(b8,6),T), 
holds(at(b9,6),T), 
holds(at(b10,6),T), 
holds(at(b11,6),T),
T<=length, 
time(T). 
 
:- not goal. 
 
%fluent at(mobile,position) "B at X" 
 
 
fluent(at(B,X)):- mobile(B), node(X).
 
% subaction travel allows to travel 
% from X to Y by an arc with capacity C 
 
sub(travel(X,Y,C)):-
arc(X,Y,C), node(X), node(Y), capacity(C). 
 
% action travel allows movil B travel from X to Y 
% by an arc with capacity C
 
action(acc(B,travel(X,Y,C))):- 
mobile(B), arc(X,Y,C). 
 
 
occurs(acc(B,S),T):-
T<=length,
not not_occurs(acc(B,S),T), 
mobile(B), sub(S), time(T).
 
% If B travel from X to Y then B at Y 
 
holds(at(B,Y),T+1):-
not not_holds(at(B,Y),T+1), 
occurs(acc(B,travel(X,Y,C)),T), time(T), 
mobile(B), capacity(C), 
node(X), node(Y), neq(X,Y). 
 
ab(at(B,Y),acc(B,travel(X,Y,C)),T):-
occurs(acc(B,travel(X,Y,C)),T), time(T), 
mobile(B), capacity(C), 
node(X), node(Y), neq(X,Y). 
 
% If B travel from X to Y then B not at X 
 
not_holds(at(B,X),T+1):-
occurs(acc(B,travel(X,Y,C)),T), time(T), 
mobile(B), capacity(C), 
node(X), node(Y), neq(X,Y). 
 
ab(at(B,X),acc(B,travel(X,Y,C)),T):-
occurs(acc(B,travel(X,Y,C)),T), time(T), 
mobile(B), capacity(C), 
node(X), node(Y), neq(X,Y). 
 
%% If B is nota t X then it can travel from X to Y. 
 
not_occurs(acc(B,travel(X,Y,C)),T):- 
not holds(at(B,X),T), 
mobile(B), node(X), node(Y),
capacity(C), time(T).
 
%% If the arc capacity is 1 then only one mobile 
% can travel by it at most per unit of time. 
 
not_occurs(acc(B1,travel(X,Y,1)),T):-
occurs(acc(B2,travel(X,Y,1)),T),
mobile(B1), mobile(B2), neq(B1,B2),
node(X), node(Y), time(T). 
 
%% If the arc capacity is 2 then 
% 2 mobiles can travel by it at most. 
 
not_occurs(acc(B1,travel(X,Y,2)),T):-
occurs(acc(B2,travel(X,Y,2)),T),
occurs(acc(B3,travel(X,Y,2)),T), 
mobile(B1), mobile(B2), mobile(B3), 
neq(B1,B2), neq(B1,B3), neq(B2,B3), 
node(X), node(Y),time(T). 
 
%% If the arc capacity is 3 then 
% three mobiles can travel by it at most per unit of time. 
 
not_occurs(acc(B1,travel(X,Y,3)),T):-
occurs(acc(B2,travel(X,Y,3)),T),
occurs(acc(B3,travel(X,Y,3)),T), 
occurs(acc(B4,travel(X,Y,3)),T), 
mobile(B1), mobile(B2), mobile(B3), mobile(B4), 
neq(B1,B2), neq(B1,B3), neq(B1,B4), 
neq(B2,B3), neq(B2,B4), neq(B3,B4), 
node(X), node(Y), time(T). 
 
%%inertia
holds(F,T+1):-
occurs(A,T), 
holds(F,T), not ab(F,A,T), 
not not_holds(F,T+1), 
fluent(F), action(A), time(T).
 
not_holds(F,T+1):-
occurs(A,T), 
not_holds(F,T), not ab(F,A,T), 
fluent(F), action(A), time(T).
 
%%Concurrency: a mobile cannot travel by two different 
% arcs at the same time.
 
not_occurs(acc(B,S1),T):-
occurs(acc(B,S2),T), 
mobile(B), time(T),
sub(S1), sub(S2), neq(S1,S2). 
 
 
hide mobile(B).
hide node(X).
hide capacity(C).
hide time(T).
hide fluent(F).
hide action(A).
hide arc(X,Y,C).
hide holds(F,T).
hide ab(F,A,T).
hide not_holds(F,T).
hide not_occurs(A,T).
hide sub(S).
hide goal.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

===========

Date: Wed, 23 Nov 2005
From: Yuting Zhao
To: Claudia Zepeda

It could be better if you run under Unix or Linux system. You may notice
that most of the benchmark testing are under Unix or Linux.

===========

Date: Sun, 4 Dec 2005
From: Alessandro Provetti
To: Claudia Zepeda

	I went over your program and it seems to me that the potential
problem is the time parameter set to 16, i.e., a high value that 
implies a 'monster' instantiation.  Is such an high value really needed?
Of course, this is not a general solution.

As per 

> I am interested in testing some naive answer set planning encodings using a
> background knowledge of "medium size" (related with a real application).
> However, sometimes memory errors occur when I run DLV or Smodels to obtain
> the answer sets for these encodings. Additionally, sometimes it takes a long
> time to obtain the answer sets. I am conscious that the execution of large
> programs using answer set solvers is one of the challenges faced today.
> However, according to your experience using these solvers, please,

In our article at MICAI 2000 (should be on the Web) we investigate the 
technique called linearization to reduce the size of the ground version
of the planner.  Linearization, unfortunately, is applicable, I believe,
only in a  single-action scenario.  However, in that scenario it does work
pretty well.

> could you tell me if you know some optimisation techniques or methods 
> that help to reduce in part the time and/or the memory requirements 
> for DLV or Smoldels to obtain the answer sets? I found that one of the
> ways to improve the performance of the answer sets solvers could be the
> use of "parallelism", do you know if there is some prototype of this kind
> of systems available? 

I think I remember that Gelfond et al. did some work in that direction, 
but I don't know about the latest results.

===========

Date: Tue, 14 Dec 2005
From: Martin Brain
To: Claudia Zepeda

> I am interested in testing some naive answer set planning encodings using a
> background knowledge of "medium size" (related with a real application).
> However, sometimes memory errors occur when I run DLV or Smodels to obtain
> the answer sets for these encodings.

These two are separate but related problems.  Memory errors occur
because the grounding is too large to be structured in memory.  The
`solution' is to work on a solver that grounds dynamically as it solves.
To the best of my knowledge no such tool exists although myself and a
number of others are working on designs.  Possible work arounds fall
into two categories - alter the solver or alter the program.  If the
memory errors are being given because the solver is running out of
address space, then moving it to a 64 bit platform *may* help (see below
for notes on swapping though), but I don't think it's the way you want
to go.  Other limitations in the solver (such as choice of data
structures, etc.) could also be removed but this would be non-trivial
unless you already know the solver internals.  In terms of altering the
program, start very small and then as you work up, check the size of the
grounding produced.  For example (as pointed out be Alessandro) starting
with time less than 16, fewer nodes, less mobile objects, capacity 2,
etc. should allow you to see which parameter effects the size of the
grounding most, from that it should be possible to work out which rules
are having the biggest impact and if needed rephrase them.

The amount of time taken to compute the answer sets is the product of a
number of things.  If the solver allocates more memory than the machine
has physical RAM, swapping is likely to destroy your performance; either
reduce the size of the ground or use a machine with more RAM.  Beyond
that the time taken is dictated by the search space involved - and how
much of it you are searching.  Are you requesting all of the answer sets
or just one?  You may find that trying different solvers gives
substantially different results as some handle certain ideoms better
than others.

> I am conscious that the execution of large
> programs using answer set solvers is one of the challenges faced today.
> However, according to your experience using these solvers, please, could you
> tell me if you know some optimisation techniques or methods that help to
> reduce in part the time and/or the memory requirements for DLV or Smoldels
> to obtain the answer sets? I found that one of the ways to improve the
> performance of the answer sets solvers could be the use of "parallelism",
> do you know if there is some prototype of this kind of systems available? 

Platypus http://www.cs.uni-potsdam.de/platypus/ is the only publically
available, open source parallel solver.  Romeo
http://www.krlab.cs.ttu.edu///Software/ is listed as available on
request from the author and I believe there is a parallel back end for
DLV.  To the best of my knowledge all of these systems work by
distributing the computation of partial interpretations, thus giving a
potential N fold speed up over N nodes (althought in reality the speed
up seems to vary between N/2 and N/1.5 ish, N/1.1 is apparently
possible).  These will help if the problem is with the size of the
search space, however altering the program is likely to be easier and
more efficient.

===========

Date: Sun, 18 Dec 2005
From: Gerald Pfeifer
To: TAG

> > I am interested in testing some naive answer set planning encodings using a
> > background knowledge of "medium size" (related with a real application).
> > However, sometimes memory errors occur when I run DLV or Smodels to obtain
> > the answer sets for these encodings.
> 
> These two are separate but related problems.  Memory errors occur
> because the grounding is too large to be structured in memory.  The
> `solution' is to work on a solver that grounds dynamically as it solves.

That's one approach.  Another one is to improve grounders to generate 
smaller residual representations of the original, non-ground input program.

Both Lparse and DLV do not generate the full, theoretical grounding in
general, and especially the latter has several optimizations/features to 
improve further upon that.

> If the memory errors are being given because the solver is running out 
> of address space, then moving it to a 64 bit platform *may* help (see 
> below for notes on swapping though), but I don't think it's the way you 
> want to go.

If it's a real scalability problem, I agree that simply throwing hardware
at it won't buy us much.

I've got access to several machines with 32 GB, 64 GB, and beyond, and 
would be happy to run some tests, but if the grounding already runs into 
the 4 GB barrier, the latter stages of the solver probably won't terminate
in any reasonable amount of time.

> Other limitations in the solver (such as choice of data structures, 
> etc.) could also be removed but this would be non-trivial unless you 
> already know the solver internals.

As far as I know Smodels and DLV, at least, there isn't much room for 
improvement left as far as space consumption by the core engine is 
concerned.

> Platypus http://www.cs.uni-potsdam.de/platypus/ is the only publically
> available, open source parallel solver.  Romeo
> http://www.krlab.cs.ttu.edu///Software/ is listed as available on
> request from the author and I believe there is a parallel back end for
> DLV.  To the best of my knowledge all of these systems work by
> distributing the computation of partial interpretations, thus giving a
> potential N fold speed up over N nodes (althought in reality the speed
> up seems to vary between N/2 and N/1.5 ish, N/1.1 is apparently
> possible).

That sounds very promising!  You made me curious, so I wanted to give
platypus a try, but failed to build on my Itanium2 servers running SUSE
Linux Enterprise Server and GCC 4.1:

  mkdir .libs
  g++ -DHAVE_CONFIG_H -I. -I. -I../config -Wall -D_REENTRANT 
  -DPT_NATIVE_WORD=32 -DPT_LINUX -DPT_FAST_MALLOC -I../include -m32 
  -pthread -g -O2 -MT barrier.lo -MD -MP -MF .deps/barrier.Tpo -c 
  ./barrier.cpp -fPIC -DPIC -o .libs/barrier.o
  cc1plus: error: unrecognized command line option "-m32"

Why does Platypus force PT_NATIVE_WORD=32 and run -m32 on a true 64-bit 
architecture?  Hmm, I see, portablethreads/README only indicates support 
for x86, x86-64, and sparc.

I'm afraid failure to build on Itanium means we won't be able to do 
scalability testing on larger machines, but let's try SUSE Linux 10.0
on x86 with GCC 4.0, perhaps I can locate a decently sized machine some-
where.

  In file included from ../include/portablethreads/barrier.h:20,
                   from ./barrier.cpp:20:
  ../include/portablethreads/config.h:21:1: warning: "PT_NATIVE_WORD"
  redefined <command line>:1:1: warning: this is the location of the previous
  definition ../include/portablethreads/arch/x86-64-gcc.h: In function
  'PortableThreads::uint64 PortableThreads::LockFree::Private::pt_ticks()':
  ../include/portablethreads/arch/x86-64-gcc.h:164: warning: left shift count
  >= width of type ../include/portablethreads/arch/native-atomic-number.h:
  In member function 'PortableThreads::pt_int_type PortableThreads::LockFree
  ::ArchSpecific::PTAtomicNumber::dec(PortableThreads::pt_int_type)':

  ...

Does anyone know who to fix these issues?

===========

Date: Sun, 18 Dec 2005
From: Martin Brain
To: Gerald Pfeifer

> That's one approach.  Another one is to improve grounders to generate 
> smaller residual representations of the original, non-ground input 
> program.
> 
> Both Lparse and DLV do not generate the full, theoretical grounding in
> general, and especially the latter has several optimizations/features to 
> improve further upon that.

My apologies, I should have said that in my view dynamic grounding is
the limit of this process.


> > Other limitations in the solver (such as choice of data structures, 
> > etc.) could also be removed but this would be non-trivial unless you 
> > already know the solver internals.
> 
> As far as I know Smodels and DLV, at least, there isn't much room for 
> improvement left as far as space consumption by the core engine is 
> concerned.

Again, I should have been clearer.  This approach may make a small
saving but as with moving to larger address spaces these are small hacks
and won't solve the underlying problem.

> > Platypus http://www.cs.uni-potsdam.de/platypus/ is the only publically
> > available, open source parallel solver.  Romeo
> > http://www.krlab.cs.ttu.edu///Software/ is listed as available on
> > request from the author and I believe there is a parallel back end for
> > DLV.  To the best of my knowledge all of these systems work by
> > distributing the computation of partial interpretations, thus giving a
> > potential N fold speed up over N nodes (althought in reality the speed
> > up seems to vary between N/2 and N/1.5 ish, N/1.1 is apparently
> > possible).
> 
> That sounds very promising!  You made me curious, so I wanted to give
> platypus a try, but failed to build on my Itanium2 servers running SUSE
> Linux Enterprise Server and GCC 4.1:
> 
>   mkdir .libs
>   g++ -DHAVE_CONFIG_H -I. -I. -I../config -Wall -D_REENTRANT 
>   -DPT_NATIVE_WORD=32 -DPT_LINUX -DPT_FAST_MALLOC -I../include -m32 
>   -pthread -g -O2 -MT barrier.lo -MD -MP -MF .deps/barrier.Tpo -c 
>   ./barrier.cpp -fPIC -DPIC -o .libs/barrier.o
>   cc1plus: error: unrecognized command line option "-m32"
> 
> Why does Platypus force PT_NATIVE_WORD=32 and run -m32 on a true 64-bit 
> architecture?  Hmm, I see, portablethreads/README only indicates support 
> for x86, x86-64, and sparc.

AFAIK it hasn't been ported to IA64 yet.

> Does anyone know who to fix these issues?

I believe PortableThreads is the work of Jean Gressmann
( jsg@rz.uni-potsdam.de ).  The rest of the Platypus team are listed
here:
http://www.cs.uni-potsdam.de/platypus/index.php?page=download

Every time I've mailed them with bugs / queries they've been most
helpful.

===========

Date: Sun, 18 Dec 2005
From: Gerald Pfeifer
To: Martin Brain

> Again, I should have been clearer.  This approach may make a small
> saving but as with moving to larger address spaces these are small hacks
> and won't solve the underlying problem.

It seems we are in full agreement here; sorry if that didn't come across
properly from my side.

(Back in 1997 and a bit later I do remember making changes to DLV which 
brought down many operations from O(log n) to O(1) and from O(n�) to O(n),
but I don't think there are any such cases left in either Smodels or DLV,
that's what I ment to indicate with my comment.)

> I believe PortableThreads is the work of Jean Gressmann ( 
> jsg@rz.uni-potsdam.de ).  The rest of the Platypus team are listed here: 
> http://www.cs.uni-potsdam.de/platypus/index.php?page=download
> 
> Every time I've mailed them with bugs / queries they've been most 
> helpful.

I provided my findings to them as well, and will happily report my 
experiences to the tag@lists.cc.utexas.edu list once I've got a version
of Platypus I can get to run on some of my big iron.  I'm quite curious!

===========

Date: Mon, 19 Dec 2005
From: Wolfgang Faber
To: Martin Brain

> I believe there is a parallel back end for DLV.

I'm afraid there is no such thing.

Date: Wed, 19 Dec 2005
From: Claudia Zepeda
To: TAG

Thanks a lot for your comments, suggestions and references.  I really 
appreciate all of them, I have started to test and use them.  
 
Only as a comment I would like to mention that I ran under Unix my 
program and it was possible to obtain a plan for 11 and 12 mobiles, i.e., 
I could consider one or two mobiles more using Unix than using Windows.  
I also reduced the time parameter. However, when I consider 10 mobiles,  
it is necessary to set at least the time parameter to 14 in 
order to obtain a plan.  Then I think that it is also a high value. Now, 
I am starting very small and checking the size of the grounding, in order 
to work out which rules are having the biggest impact.