Digital search trees again revisited: the internal path length perspective

Abstract. This paper studies the asymptotics of the variance for the internal path length in a symmetric digital search tree. The problem was open up to now. We prove that for the binary digital search tree the variance is asymptotically equal to $0.26600\dots \cdot N+N\delta(\log_2 N)$ where N is the number of stored records and $\delta(x)$ is a periodic function of mean zero and a very small amplitude. This result implies that the internal path length becomes asymptotically $N\cdot\log_2 N$ {\it with probability one} (i.e. almost surely). In our previous work we have argued that the variance of the internal (external) path length is a good indicator how well the digital trees are balanced. We shall show that the digital search tree is the best balanced digital tree in the sense that a random shape of this tree strongly resembles a shape of a complete tree. Therefore, we conclude that a symmetric digital tree is a good candidate for a dictionary structure, and a typical search time is asymptotically equal to the optimal one for these type of structures. Finally, in order to prove our result we had to solve a number of nontrivial problems concerning analytic continuations and some others of numerical nature. In fact, our results and techniques are motivated by the methodology introduced in an influential paper by Flajolet and Sedgewick.,, (Wojciech Szpankowski),

This paper is available in the Tex, Dvi, and PostScript format.
(Back to List of Papers)