Structure Sharing and Text Editing

Our structure sharing ideas have an odd connection to modern text editors, specifically Microsoft Word.

Bob Boyer and I were at the University of Edinburgh, Scotland, in the early 1970s working on automatic theorem proving. We programmed in POP-2, a programming language developed there by Robin Popplestone and Rod Burstall. POP-2 was the language supported by the computing platform available to us: an ICL 4130 processor with 64K bytes of 2 microsecond memory shared between 8 interactive jobs controlled by teletype input from the users. File storage was provided on three 4MB disks.

Text editing was accomplished by a software simulated paper tape editor in which text was streamed through a 1 byte buffer and written to the output file. Having both come from MIT where we had grown accustomed to the Teco editor, we were frustrated by this editor and wrote our own editing software. The challenge was how to edit documents that were too big to fit in memory.

The structure sharing ideas from our theorem proving research suggested the solution: represent edited text by pointing to the original text and indicating how it is to be changed.

We implemented this idea in a text editor in Edinburgh in 1972-3. The original text was kept on disk and the edited text was kept in a doubly-linked chain of records in memory. Each record represented some text and the concatenation of that text represented the edited document. A record represented text either by pointing to a delimited region of text on the disk or to a character string of newly typed text in memory.

Using this representation it is easy to implement the basic operations of insert, delete, search, display and save. For example, to insert text at a given point: split the record containing the point and splice in a new record containing the new text.

Suppose you have a file named story on disk and that it contains the 52 bytes

The quick brown fox jumped over the sleeping poodle.
Within the editing software, this is represented by a single record:
Now imagine you want to insert the characters ``lazy '' in front of the word ``sleeping'' (which starts at position 36 in the file). The resulting representation is:
<"lazy ",0,4>

Saving this session produces the file:

The quick brown fox jumped over the lazy sleeping poodle.

Our Edinburgh editor, which was like Teco but with POP-2 as its command language, was called the ``77-Editor'' because it resided on disk track 77. It was used widely in the Machine Intelligence Department at Edinburgh for about 10 years.

This representation has four nice features:

When I joined Xerox PARC in 1974, I presented this idea in a technical meeting. It solved the memory limitations problems confronting Charles Simonyi who was creating the Bravo editor at PARC. Charles called this the piece table representation. When Charles moved to Microsoft, he used the piece table representation when he created Word and it is still in use in Word.

The ``Historical Preface'' of the above 1981 tech report gives a brief history.

[Best Ideas] [Publications] [Research] [Home]