Sparse bitsets library: an alternative to bitsets for representing finite sets of natural numbers.

In the ordinary bitsets library, sets of natural numbers are encoded as bignums. This is perfectly fine when the set elements are relatively small, but it is not an efficient way to deal with sets of large numbers. For instance, trying to insert a number like 2^30 into an ordinary bitset will probably kill your Lisp.

Sparse bitsets are a more forgiving alternative. They can handle very large set elements, but still achieve bitset-like efficiencies in many cases.

Sparse bitsets are represented as ordered lists of

offset*blocksize + 0 \in X iff (logbitp 0 block) offset*blocksize + 1 \in X iff (logbitp 1 block) ... offset*blocksize + (blocksize-1) \in X iff (logbitp (blocksize-1) block)

The

Sparse bitsets are essentially a hybrid of ordered sets and bitsets, and their efficiency characteristics are very data dependent.

Sparse bitsets perform best on sets that have "clusters" of nearby members, perhaps separated by wide gaps. Here, sparse bitsets give you some of the same benefits of bitsets, viz. word-at-a-time operations like union and the space efficiency of using bit masks to represent the set members.

For completely dense sets, e.g., all integers in

- Better than ordered sets, because they get to compress nearby elements into bitmasks and use word-at-a-time operations.
- Somewhat worse than ordinary bitsets. They are less space efficient due to the conses and offset numbers. They are also more expensive for membership testing: regular bitsets can basically use O(1) array indexing via logbitp, while sparse bitsets must use an O(n) search for the right pair.

For "cluster-free" sets where the elements are so far apart that they never fall into the same block, sparse bitsets are:

- Better than bitsets, which perform badly here since they can have large segments of 0 bits that waste space and take time to process during set operations.
- Somewhat worse than ordered sets. For instance, a set like {0, 60, 120}
would be encoded as
((0 . 1) (1 . 1) (2 . 1)) , which is similar to its ordered set representation except for the additional overhead of using(1 . 1) to represent 60, etc.

Convention:

a, b, ... denote set elementsX, Y, ... denote sets.

Valid sparse bitsets are recognized by sbitsetp, and there is a standard fixing function, sbitset-fix.

We intend to eventually implement sparse analogues for all of the functions in the bitsets library, and analogues for the bitset reasoning techniques. For now we have only finished a few of these operations:

(sbitset-singleton a) - Constructs the set
{a} (sbitset-union X Y ...) - Constructs the set
X U Y U ... (sbitset-intersect X Y ...) - Constructs the set
X \intersect Y \intersect ... (sbitset-difference X Y) - Constructs the set
X - Y

- Todo

(sbitset-members X) - Constructs an ordinary ordered set with the elements of X.
- Important in reasoning about sparse bitsets.

- Sbitset-members
`(sbitset-members x)`converts a bitset into an ordinary, ordered set.- Sbitset-pairp
`(sbitset-pairp x)`recognizes a valid(offset . block) pair for sparse bitsets.- Sbitsetp
`(sbitsetp x)`determines whetherX is a well-formed sparse bitset.- Sbitset-difference
(sbitset-difference X Y) constructs the setX - Y .- Sbitset-intersect
(sbitset-intersect X Y ...) constructs the setX \intersect Y \intersect ... .- Sbitset-union
(sbitset-union X Y ...) constructs the setX U Y U ... .- Sbitset-singleton
`(sbitset-singleton a)`constructs the singleton set{a} .- Sbitset-fix
`(sbitset-fix x)`is a basic fixing function for sparse bitsets.