\input amstex
\magnification= \magstep1
\documentstyle{amsppt}
\NoBlackBoxes
\define\({\left(}
\define\){\right)}
\define\loq {\mathop\roman{lq}\nolimits}
\redefine\P{\Bbb P}
\topmatter
\rightheadtext{Maximum Statistics $\ldots$}
\title
maximum statistics of $N$ random variables\\ distributed by the \\
negative binomial distribution
\endtitle
\author Peter J. Grabner and Helmut Prodinger\endauthor
\affil
Department of Mathematics\\
Technical University of Graz, Austria\\
and\\
Department of Algebra and Discrete Mathematics \\
Technical University of Vienna, Austria
\endaffil
\address
\hbox to\hsize{\vbox{\hbox to 5 truecm{P. Grabner\hfill}
\hbox to 5 truecm{Institut f\"ur Mathematik A\hfill}
\hbox to 5 truecm{Steyrergasse 30\hfill}
\hbox to 5 truecm{A--8010 Graz, Austria\hfill}
\hbox to 5 truecm{\hfill}}
\hfill\vbox
{\hbox to 5 truecm{H. Prodinger\hfill}
\hbox to 5 truecm{Institut f\"ur Algebra\hfill}
\hbox to 5 truecm{und Diskrete Mathematik\hfill}
\hbox to 5 truecm{Wiedner Hauptstra\ss{}e 8--10\hfill}
\hbox to 5 truecm{A--1040 Wien, Austria\hfill}}}
\endaddress
\email grabner\@ftug.dnet.tu--graz.ac.at,
proding\@email.tuwien.ac.at
\endemail
\abstract
The lifetime of a player is defined to be the
time where he gets his $b$th
hit, where a hit will occur with probability $p$. We consider the maximum
statistics of $N$ independent players. For $b\ne1$ this is significantly
more difficult than the known instance $b=1$. The average value of this
parameter is given by $\log_QN+(b-1)\log_Q\log_QN+$ smaller order terms,
where $Q=1/(1-p)$.
\endabstract
\thanks
This research was supported by the Austrian-Hungarian cooperation 10U3
\endthanks
\subjclass 05A15\endsubjclass
\endtopmatter
\document
\heading 1. Introduction\endheading
Consider a geometrically distributed random variable $X$, i.e.
$\P\{X=k\}=pq^{k-1}$ ($q=1-p$). If we have $N$ independent copies
$X_1,\dots, X_N$, the {\sl maximum\/} of them is well understood,
compare \cite{7} and \cite{5}.
Now if $X$ is the sum of two independent geometric random variables
(with the same parameter), the corresponding maximum of $N$ such
variables is surprisingly more complicated. In the present note
we will work out this case and indicate how the more general case
of sums of $b$ copies can be handled. Also, we give analytic
indications why the asymptotic evalution of the expectated value
is more complex.
In the sequel, we will use the suggestive notation of a ``player''
who ``dies'' at the second ``hit''. In this context, we will concentrate
on the expected value of the maximum lifetime of $N$ independent
players. To be precise, if a player gets his second hit at time $k$,
we say that his lifetime is $k$. Such a waiting time distribution
is generally known to be a {\sl negative binomial\/} distribution
(cf. \cite{4}).
Now we compute the probability that a particular player has
lifetime $>k$. This is the same as to say that he is still alive
at time $k$, which means that he obtained either zero or one hit.
Hence
$$
\P\{X >k\}=q^k+kpq^{k-1}.
$$
Thus we have for the expectation of the maximum lifetime of $N$ players
$$\Bbb E_N=\Bbb
E\max(X_1,\ldots,X_N)=\sum_{k\geq0}\(1-\(1-(pk+q)q^{k-1}\)^N\).\tag1.1$$
The usual procedure for the analysis of such sums would proceed as follows:
Using the binomial theorem we would obtain
$$\Bbb E_N=\sum_{m=1}^N(-1)^{m-1}\binom
Nm\sum_{k\geq0}\((pk+q)q^{k-1}\)^m.\tag1.2$$
Then we would have to write this sum as a contour integral (cf. \cite{3}).
But following Rice's method would require deformation of the contour of
integration
and therefore analytic continuation of the function
$$f(s)=\sum_{k\geq0}\((pk+q)q^{k-1}\)^s\tag1.3$$
beyond the line $\Re s=0$.
We tried hard to get such continuation but did not succeed. We will explain
after
the analysis of $\Bbb E_N$, why this is so difficult.
\heading 2. Asymptotic Evaluation of the Sum\endheading
As we have indicated in the introduction the asymptotic expansion of $\Bbb E_N$
cannot be obtained by standard methods. Further analysis can be done by using
some
approximation techniques. Our first (``exponential'') approximation will give
us a more suitable form of $\Bbb E_N$. In the following we will use the
notations
$\loq x$ to indicate $\log_{\frac1q}x$ and $Q=\frac1q$.
Let us define
$$\tilde\Bbb E_N=\sum_{k\geq0}\Big(1-\exp\big(-N(pk+q)q^{k-1}\big)\Big).
\tag2.1$$
Then we have
$$\aligned\Bbb E_N-\tilde\Bbb E_N=&\(\sum_{k\leq\loq N}+\sum_{\loq
N\loq N+2\loq \loq N}\)\cr
&\(\exp\(-(pk+q)q^{k-1}N\)-\(1-(pk+q)q^{k-1}\)^N\).\endaligned\tag2.2$$
We have split the sum into three parts and will give estimates for each part.
\roster
\item $k\leq\loq N$: In this range we have $(pk+q)q^{k-1}\geq\frac{p\loq
N}{qN}+\frac1N$.
Thus we obtain
$$\exp\(-(pk+q)q^{k-1}N\)\leq N^{-\frac p{q\log Q}}\quad\text{ and }
\(1-(pk+q)q^{k-1}\)^N\leq N^{-\frac p{q\log Q}}$$
and consequently we have $N^{-\frac p{q\log Q}}\loq N$ as an upper bound for
this
first sum.
\item $\loq N\loq N+2\loq \loq N$: Again we use Taylor expansion to obtain
$$\exp(-Nu)-(1-u)^N=O\(N(pk+q)^2q^{2k-2}\)$$
for $u=(pk+q)q^{k-1}$. This yields an upper bound
$$O\(N\sum_{k>\loq N+2\loq \loq N}(pk+q)^2q^{2k-2}\)=O\(\frac{\log^2N}N\)$$
for the sum in this range.
\endroster
Combining the three estimates yields
$$\Bbb E_N-\tilde\Bbb E_N=O\(\frac{\log^2N\log\log N}N\).$$
The standard technique for deriving asymptotic estimates for $\tilde\Bbb E_N$
uses Mellin transform and therefore again the analytic continuation of $f(s)$.
Thus we use a second approximation to $\Bbb E_N$ which finally yields the
desired
asymptotic expansion.
For this purpose we define
$$\Bbb E_N^*=\sum_{k\geq0}\Big(1-\exp\big(-(p\loq N+q)Nq^{k-1}\big)
\Big).\tag2.3$$
Now we have to give estimates for $\tilde\Bbb E_N-\Bbb E_N^*$. We compute
$$\aligned \tilde\Bbb E_N-\Bbb E_N^*&=\sum_{k\leq\loq N}\exp\(-(pk+q)q^{k-1}N\)
\Big(\exp\(-p(\loq N-k)q^{k-1}N\)-1\Big)\cr
&+\sum_{k>\loq N}\exp\(-(p\loq N+q)q^{k-1}N\)
\Big(1-\exp\(-p(k-\loq N)q^{k-1}N\)\Big).\endaligned\tag2.4$$
As above we get the estimate $O(N^{-\frac p{q\log Q}}\log N)$ for the first sum.
For the second sum we use the estimate
$$1-\exp\(-p(k-\loq N)q^{k-1}N\)=O\(p(k-\loq N)q^{k-1}N\).$$
We split the range of summation into two parts:
\roster
\item $\loq N\loq N+\loq \loq N+\loq p+1$: in this range we have $N\loq N\leq
q^{-k}$
and thus the remaining sum can be estimated using the summation formula for the
geometric series
$$O\(\sum_{k>\loq N+\loq \loq N}p(k-\loq N)q^{k-1}N\)=O\(\frac{\log\log
N}{\log N}\).$$
\endroster
Therefore we arrive at the final estimate
$$\Bbb E_N-\Bbb E_N^*=O\(\frac{(\log\log N)^2}{\log N}\).$$
Notice now that $\Bbb E_N^*$ is the sum
$$S(T)=\sum_{k\geq0}\Big(1-\exp\(-Tq^k\)\Big)$$
evaluated at $T=\frac pqN\loq N+N$. This sum occurred in many asymptotic
investigations,
e.g. \cite{6}. By Mellin's inversion formula \cite{1,2} we have
$$S(T)=\frac1{2\pi i}\int\limits_{-\frac12-i\infty}^{-\frac12+i\infty}
\Gamma(s)\frac{T^{-s}ds}{Q^s-1}.$$
Hence we get the asymptotic expansion of $S(T)$ by shifting the line of
integration
to the right and taking residues into account
$$S(T)=\loq T+\frac12+\frac\gamma{\log Q}+F(\loq T)+O\big(\frac1T\big),$$
where $F$ is an infinitely differentiable periodic function. Thus we have proved
\proclaim{Theorem 1}
$$\Bbb E_N^*=\loq N+\loq \loq N+\loq p+\frac32+\frac\gamma{\log Q}+
F\(\loq N+\loq \loq N+\loq p\)+O\(\frac{(\log\log N)^2}{\log N}\)\tag2.5$$
where $F$ is a $C^\infty$ periodic function of period $1$ and mean value $0$
whose Fourier-coefficients are given by
$$\hat F(k)=-\frac1{\log Q}\Gamma\(-\frac{2k\pi i}{\log Q}\)
\quad\text{{\rm for} } k\in\Bbb Z\setminus\{0\}.$$
\endproclaim
\heading 3. Concluding Remarks\endheading
If we translate the asymptotics of $\Bbb E_N^*$ back to analytic properties of
the Mellin-transform of $S(T\loq T)$, we encounter logarithmic singularities
(corresponding to the $\log\log$-terms) at $s=\frac{2k\pi i}{\log Q}$ for all
$k\in\Bbb Z$. These singularities would give us an infinity of branch-cuts which
are the reason for the difficulties mentioned in the introduction. Notice, that
this is not a proof for the existence of an analytic continuation of the
function
$f(s)$ in \thetag{1.3}.
The method used above to find an asymptotic expansion for the sum
$$\sum_{k\geq0}\(1-\(1-(pk+q)q^{k-1}\)^N\)\tag3.1$$
can be immediately extended to find the expansion for
$$S_g(N)=\sum_{k\geq 0}\(1-\(1-g(k)q^k\)^N\)$$
where $g(k)\sim\beta k^\alpha$ and $g(k)>0$ for $k\geq 0$.
The resulting asymptotic expansion is given by
$$S_g(N)=\loq N+\alpha\loq\loq N+\loq\beta+\frac12+\frac\gamma{\log Q}+
F(\loq N+\alpha\loq\loq N+\loq\beta)+o(1)\tag3.2$$
where $F$ is the periodic function of Theorem~1.
This general result can be used to compute the expected maximal lifetime of $N$
players who die after $b$ hits.
It is quite plain to see that
$$
\P\{X>k\}=q^k+kpq^{k-1}+\dots+\binom{k}{b-1}p^{b-1}q^{k-b+1}
.
$$
Hence the $g(k)$ from before is
$$
g(k)=1+k\frac pq+\dots+\binom{k}{b-1}\big(\frac pq \big)^{k-1}
\sim \frac{k^{b-1}}{(b-1)!}\big(\frac pq\big)^{k-1}
$$
Therefore this expectation is given by
$$\aligned\Bbb E_N^{(b)}&=\loq N+(b-1)\loq\loq N+
(b-1)\loq p+(b-1)-\loq(b-1)!+\frac12+
\frac\gamma{\loq Q}\cr
&+F\bigl(\loq N+(b-1)\loq N+(b-1)\loq
p-\loq(b-1)!\bigr)+o(1).\endaligned\tag3.3$$
\smallskip
This result tells us that, if we allow an additional hit, the maximum
of $N$ players will roughly increase by $\loq\loq N$.
\Refs
\ref\no 1\by G.Doetsch\book Handbuch der Laplace-Transformation\publ
Birkh\"auser
\publaddr Basel\yr1950\endref
\ref \no2 \by P.Flajolet, M.R\'egnier, and R.Sedgewick \paper
Some Uses of the Mellin Integral Transform in the Analysis of Algorithms
\inbook Combinatorics on Words \publ Springer NATO ASI Series F, Volume 12
\publaddr Berlin \yr 1985
\endref
\ref \no 3 \paper The Fibonacci Killer \toappear \by P.Grabner and
H.Prodinger\jour The Fibonacci Quarterly\yr 1994
\endref
\ref \no 4 \book Urn models and Their Applications\by N.L.Johnson
and S.Kotz \yr 1977 \publ John Wiley and Sons \publaddr New York
\endref
\ref \no 5\paper A Result in Order Statistics Related to Probabilistic
Counting \jour Computing \vol51\pages 15--27\yr1993\by
P.Kirschenhofer and H.Prodinger
\endref
\ref \no6 \book The Art of Computer Programming, Vol.3
\by D.E.Knuth\yr 1973\publ Addison--Wesley\publaddr Reading, MA
\endref
\ref \no 7 \by
W.Szpankowski and V.Rego\vol 43
\paper Yet another application of a binomial recurrence:
Order Statistics \jour Computing\yr1990\pages401--410
\endref
\endRefs
\enddocument