E.W.Dijkstra Archive: Distances from the root in skew trees (EWD 801)

Distances from the root in skew trees

by Edsger W. Dijkstra and C.S. Scholten

Renewed interest in algorithms for sorting in situ raised the question of the average distance from node to root in binary trees other than balanced ones. (Here binary trees are to be understood as rooted trees in which nodes have zero, one, or two sons.)

In particular we consider the infinite sequence of trees T_i (i ≥ 0), in which for some fixed p and q (p > q ≥ 0)

T_i for 0 ≤ i < p are arbitrarily chosen
T_n and T_{n+q} are the subtrees of T_{n+p}

With H_i = the number of nodes in T_i, we have

H_{n+p} = H_{n+q} + H_n + 1 ;

with G_i = the sum of the distances of the nodes of T_i from its root, we have

G_{n+p} = G_{n+q} + G_n + H_{n+q} + H_n.
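The two recurrences can be checked against explicitly built trees. The following is a minimal sketch, not part of the original note: it assumes that the p initial trees are all empty, represents a tree as None or a pair of subtrees, and uses the hypothetical helper names build_trees, size, depth_sum and check.

```python
def build_trees(p, q, count):
    """Return T_0 .. T_{count-1}; a tree is None (empty) or a pair of subtrees."""
    trees = [None] * p                            # assumption: the p initial trees are empty
    for n in range(count - p):
        trees.append((trees[n], trees[n + q]))    # T_{n+p} has subtrees T_n and T_{n+q}
    return trees


def size(t):
    """Number of nodes of t, counted by direct traversal."""
    return 0 if t is None else 1 + size(t[0]) + size(t[1])


def depth_sum(t, depth=0):
    """Sum over all nodes of t of their distance from the root."""
    if t is None:
        return 0
    return depth + depth_sum(t[0], depth + 1) + depth_sum(t[1], depth + 1)


def check(p, q, count):
    """Compare the directly counted H_i and G_i with the two recurrences."""
    trees = build_trees(p, q, count)
    H = [size(t) for t in trees]
    G = [depth_sum(t) for t in trees]
    for n in range(count - p):
        assert H[n + p] == H[n + q] + H[n] + 1
        assert G[n + p] == G[n + q] + G[n] + H[n + q] + H[n]
    return H, G


H, G = check(p=2, q=1, count=12)
print(H)
print(G)
```

The direct traversal takes time proportional to the size of each tree, which grows exponentially with i, so count should stay small.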

We are interested in the asymptotic behaviour of G_i / H_i for large i, this ratio being the average distance from the root in T_i.

Without loss of generality we can confine ourselves to the case gcd(p,q) = 1, since in the case gcd(p,q) = g > 1, the sequence T_i consists of an interleaving of g mutually independent sequences.

With A_i defined by A_i = H_i + 1 and B_i defined by B_i = G_i – 2, we derive

(0) A_{n+p} = A_{n+q} + A_n
(1) B_{n+p} = B_{n+q} + B_n + A_{n+p}
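That the shifts by +1 and –2 remove the constant terms can be confirmed mechanically. The sketch below is an illustration under assumptions: the initial values H0 and G0 are made-up examples (a single node, a root with two sons, a root with one son), and the function names are hypothetical.

```python
def sequences(p, q, H0, G0, count):
    """Extend the initial values H0, G0 (each of length p) by the recurrences for H and G."""
    H, G = list(H0), list(G0)
    for n in range(count - p):
        H.append(H[n + q] + H[n] + 1)
        G.append(G[n + q] + G[n] + H[n + q] + H[n])
    return H, G


def check_shifted(p, q, H0, G0, count=30):
    """Form A_i = H_i + 1 and B_i = G_i - 2 and assert relations (0) and (1)."""
    H, G = sequences(p, q, H0, G0, count)
    A = [h + 1 for h in H]
    B = [g - 2 for g in G]
    for n in range(count - p):
        assert A[n + p] == A[n + q] + A[n]               # (0)
        assert B[n + p] == B[n + q] + B[n] + A[n + p]    # (1)
    return A, B


# Example initial trees for p = 3: one node, a root with two sons, a root with one son.
A, B = check_shifted(p=3, q=1, H0=[1, 3, 2], G0=[0, 2, 1])
print(A[:6], B[:6])
```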

Equation (1) is not homogeneous in the B's, but by solving it for the A's and substituting them in (0), we get a homogeneous recurrence relation for the B's, the characteristic polynomial of which is the product of (0)'s characteristic polynomial and the characteristic polynomial corresponding to the homogeneous part of (1). We conclude that the characteristic equation for the A_i is

(2) x^p – x^q – 1 = 0

and that the one for the B_i is

(3) (x^p – x^q – 1)² = 0

Under the constraints gcd(p,q) = 1 and p > q ≥ 0, (2) enjoys the property of having one positive root, r say, such that r > 1 and all other roots of (2) have a modulus smaller than r. (For a proof of this theorem, see later.)

From this and the theory of linear recurrence relations we conclude

1)	that the leading term of A_i is of the form k · rⁱ for some constant k
2)	that the leading term of B_i is of the form ( l + L·i ) · rⁱ for some constants l and L.
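Claim 1) can be observed numerically. The sketch below is an illustration, not part of the original note: it assumes A_i = 1 for i < p (that is, empty initial trees) and uses the hypothetical helper names dominant_root and growth_ratio. It locates r by bisection on the interval [1, 2] and compares it with the ratio of two successive A_i.

```python
def dominant_root(p, q, tol=1e-14):
    """Bisection for the positive root r of x**p - x**q - 1 = 0, with p > q >= 0."""
    f = lambda x: x**p - x**q - 1
    lo, hi = 1.0, 2.0          # f(1) = -1 < 0 and f(2) = 2**q * (2**(p - q) - 1) - 1 >= 0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def growth_ratio(p, q, steps=200):
    """Ratio of the last two terms after extending A by `steps` applications of (0)."""
    A = [1] * p                # assumption: empty initial trees, hence A_i = 1 for i < p
    for n in range(steps):
        A.append(A[n + q] + A[n])
    return A[-1] / A[-2]       # exact integers; only the final division is floating point


print(dominant_root(2, 1), growth_ratio(2, 1))
print(dominant_root(3, 1), growth_ratio(3, 1))
```

For p = 2, q = 1 both values should be close to the golden ratio, about 1.618.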

Substituting these leading terms in (1) we get

( l + L·(n+p) ) · r^{n+p} =
( l + L·(n+q) ) · r^{n+q} + ( l + L·n ) · rⁿ + k · r^{n+p}

Since r is a root of (2) this can be reduced to

(4) L / k = r^p / ( p · r^p – q · r^q )
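Relation (4) can be compared with the recurrences themselves. Since B_i / A_i behaves as ( l + L·i ) / k for large i, the difference of two successive quotients should approach L / k. The sketch below is a numerical illustration under assumptions: empty initial trees (A_i = 1 and B_i = –2 for i < p) and hypothetical function names.

```python
from fractions import Fraction


def dominant_root(p, q, tol=1e-14):
    """Bisection on [1, 2] for the positive root of x**p - x**q - 1 = 0."""
    f = lambda x: x**p - x**q - 1
    lo, hi = 1.0, 2.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def slope_from_recurrences(p, q, steps=300):
    """Estimate L / k as B_{i+1}/A_{i+1} - B_i/A_i for a large i."""
    A = [1] * p                # assumption: H_i = 0 for i < p, so A_i = 1
    B = [-2] * p               # assumption: G_i = 0 for i < p, so B_i = -2
    for n in range(steps):
        A.append(A[n + q] + A[n])                 # (0)
        B.append(B[n + q] + B[n] + A[n + p])      # (1); A[n + p] was appended just above
    return float(Fraction(B[-1], A[-1]) - Fraction(B[-2], A[-2]))


def slope_from_formula(p, q):
    """Right-hand side of (4)."""
    r = dominant_root(p, q)
    return r**p / (p * r**p - q * r**q)


for p, q in [(1, 0), (2, 1), (3, 1)]:
    print(p, q, slope_from_recurrences(p, q), slope_from_formula(p, q))
```

Exact rational arithmetic is used for the quotients so that the subtraction of two nearly equal large numbers loses no precision.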

We define the skewness of a binary tree as the ratio (≥ 1) of the numbers of nodes in its two subtrees. For the trees T_i it follows from the leading term of A_i that the asymptotic skewness s is given by s = r^q, whence q = ^rlog s, where ^rlog denotes the logarithm to the base r. Remembering that r satisfies (2), we find r^p = s+1, whence p = ^rlog (s+1). Hence (4) can be rewritten as

(5) L / k = (s+1) / ( (s+1) · ^rlog (s+1) – s · ^rlog s )

Because r > 1 —so that the asymptotic value of G_i / H_i equals that of B_i / A_i—, we conclude that, expressed in r and s, the average distance from the root in T_i is for large i

(s+1) · i / ( (s+1) · ^rlog (s+1) – s · ^rlog s )

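Consequently, since H_i grows as k · rⁱ and hence i as ^rlog N, the average distance from the root in a tree from the sequence T_i with N nodes is for large N

(s+1) · ^rlog N / ( (s+1) · ^rlog (s+1) – s · ^rlog s )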
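To get a feeling for the magnitudes, the result can be expressed relative to the binary logarithm. The sketch below rests on a derived assumption: with N growing as k · rⁱ, i may be replaced by ^rlog N, and because only a quotient of logarithms remains, any base may be used, giving (s+1) · log N / ( (s+1) · log (s+1) – s · log s ). The function name log2_coefficient is hypothetical.

```python
import math


def log2_coefficient(s):
    """Factor c such that the average distance is about c * log2(N), for skewness s >= 1."""
    return (s + 1) * math.log(2) / ((s + 1) * math.log(s + 1) - s * math.log(s))


golden = (1 + math.sqrt(5)) / 2      # asymptotic skewness for p = 2, q = 1

print(log2_coefficient(1.0))         # s = 1: the two subtrees are of equal size
print(log2_coefficient(golden))
print(log2_coefficient(2.0))
```

For s = 1 the factor is 1, and for the skewness belonging to p = 2, q = 1 it is only about 1.04, so under this reading moderate skewness costs little in average distance.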

We are left with the obligation to prove that f x = 0 with f x = x^p – x^q – 1 has one positive root r > 1 dominating the others. Since f 0 = –1 and f(+∞) = +∞, f x = 0 has an odd number of positive roots. Because f' x = x^{q–1} · ( p·x^{p–q} – q ) we conclude that f' x = 0 has at most 1 positive root, so that by Rolle's theorem f x = 0 has at most 2 positive roots; hence f x = 0 has 1 positive root r and, because f 1 = –1, we conclude r > 1. In other words

(6) for x ≥ 0 sgn(f x) = sgn(x–r)

In order to prove dominance of r, we consider a root m·e^{i·φ} of (2), with m > 0. Consequently

m^p · e^{i·p·φ} = m^q · e^{i·q·φ} + 1 ,

from which we derive —by taking absolute values—

m^p < m^q + 1 ∨ (p·φ) mod 2π = (q·φ) mod 2π = 0 .

Since gcd(p,q) = 1, this can be rewritten as m^p – m^q – 1 < 0 ∨ φ = 0 or, in view of (6): m < r ∨ φ = 0. q.e.d.
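The dominance of r can also be inspected numerically for small p and q. The sketch below is an illustration and no substitute for the proof: it uses the Durand-Kerner iteration, a general-purpose root finder chosen here for convenience, with the customary starting points (0.4 + 0.9i)^k; the function names and the tolerance 1e-9 are assumptions.

```python
def all_roots(p, q, iterations=500):
    """Durand-Kerner iteration for the p roots of the monic polynomial x**p - x**q - 1."""
    f = lambda x: x**p - x**q - 1
    roots = [(0.4 + 0.9j) ** k for k in range(p)]    # distinct complex starting points
    for _ in range(iterations):
        for i in range(p):
            denom = 1
            for j in range(p):
                if j != i:
                    denom *= roots[i] - roots[j]
            roots[i] -= f(roots[i]) / denom          # update each approximation in place
    return roots


def check_dominance(p, q, eps=1e-9):
    """True if the root of largest modulus is real, exceeds 1, and strictly dominates the rest."""
    roots = sorted(all_roots(p, q), key=abs, reverse=True)
    r = roots[0]
    return (abs(r.imag) < eps and r.real > 1
            and all(abs(z) < abs(r) - eps for z in roots[1:]))


print(check_dominance(2, 1))
print(check_dominance(3, 1))
```

When gcd(p,q) > 1, for instance p = 4 and q = 2, the equation has both r and –r as roots, so the same check should report that no single root dominates.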