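18 Probability Limits

18.1 Almost Sure Convergence

Theorem 18.1 If the sequence of random vectors \(\boldsymbol\xi_n\) in \(\mathbb R^m\) converges a.s. to \(\boldsymbol\xi\) and \(g(\cdot)\) is a continuous function from \(\mathbb R^m\) to \(\mathbb R\), then \(g(\boldsymbol\xi_n)\) converges a.s. to \(g(\boldsymbol\xi)\).

The conclusion also holds if \(g(\cdot)\) is a continuous function defined on an open or closed set \(G\) and \(\boldsymbol\xi_n\), \(\boldsymbol\xi\) all take values in \(G\).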

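18.2 Convergence in Probability

Theorem 18.2 Let \(\{ a_n \}\) be a sequence of real numbers. Then \(a_n \stackrel{\mbox{P}}{\rightarrow} a\) is equivalent to \(\lim_{n\to\infty} a_n = a\).

Proof. Sufficiency. Suppose \(\lim_n a_n = a\). Then for every \(\delta>0\) there exists \(N\) such that \(|a_n - a| \leq \delta\) whenever \(n > N\). Since \(a_n\) is nonrandom, \(P(|a_n - a| > \delta) = 0\) for \(n > N\), so \[\begin{aligned} \lim_n P(|a_n - a| > \delta) = \lim_n 0 = 0. \end{aligned}\]

Necessity. Suppose \(a_n \stackrel{\mbox{P}}{\rightarrow} a\). For any \(\delta > 0\), let \(p_n = P(|a_n - a| > \delta)\); then \(\lim_n p_n = 0\). But \(p_n\) can only take the values 0 or 1, so \(\{ p_n \}\) contains only finitely many 1s. Hence there exists \(N\) such that \(P(|a_n - a| > \delta)=0\) for \(n>N\), that is, \(|a_n - a| \leq \delta\) for \(n>N\), so \(\lim_n a_n = a\).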

○○○○○○

Theorem 18.3 If \(\xi_n \stackrel{\mbox{P}}{\rightarrow} \xi\) and \(\eta_n \stackrel{\mbox{P}}{\rightarrow} \eta\), then \[\begin{aligned} \xi_n + \eta_n \stackrel{\mbox{P}}{\rightarrow} \xi + \eta. \end{aligned}\]

Proof. For any given \(\varepsilon>0\), \[\begin{aligned} & P(|(\xi_n + \eta_n) - (\xi+\eta)| \geq \varepsilon) \\ \leq& P(|\xi_n - \xi| + |\eta_n - \eta| \geq \varepsilon) \\ \leq& P(|\xi_n - \xi| \geq \frac{\varepsilon}{2} \text{ or } |\eta_n - \eta| \geq \frac{\varepsilon}{2}) \\ \leq& P(|\xi_n - \xi| \geq \frac{\varepsilon}{2}) + P(|\eta_n - \eta| \geq \frac{\varepsilon}{2}) \\ \to& 0, \quad(n\to\infty) \end{aligned}\]

○○○○○○

Theorem 18.4 If \(\xi_n \stackrel{\mbox{P}}{\rightarrow} \xi\) and \(a\) is a constant, then \[\begin{aligned} a \xi_n \stackrel{\mbox{P}}{\rightarrow} a \xi. \end{aligned}\]

Proof. The case \(a=0\) is trivial. When \(a \neq 0\), for every \(\varepsilon>0\), \[\begin{aligned} &P(|a \xi_n - a \xi| \geq \varepsilon) \\ =& P(|\xi_n - \xi| \geq \frac{\varepsilon}{|a|}) \to 0 \quad(n\to\infty) \end{aligned}\]

Lemma 18.1 Let \(\{\xi_n \}\), \(\{ \eta_n \}\) be two sequences of random variables and \(a, b, c\) constants. If \(\xi_n \stackrel{\mbox{P}}{\rightarrow} \xi\) and \(\eta_n \stackrel{\mbox{P}}{\rightarrow} \eta\), then \(a \xi_n + b\eta_n + c \stackrel{\mbox{P}}{\rightarrow} a\xi + b \eta + c\).

Theorem 18.5 If \(\xi_n \stackrel{\mbox{P}}{\rightarrow} a\), where \(a\) is a constant, and the function \(g(\cdot)\) is continuous at \(a\), then \[\begin{aligned} g(\xi_n) \stackrel{\mbox{P}}{\rightarrow} g(a). \end{aligned}\]

Proof. For every \(\varepsilon>0\) there exists \(\delta>0\) such that \(|g(x) - g(a)| < \varepsilon\) whenever \(|x-a|<\delta\). Hence \[\begin{aligned} |g(x) - g(a)| \geq \varepsilon \Longrightarrow |x-a| \geq \delta \end{aligned}\] and therefore \[\begin{aligned} P(|g(\xi_n) - g(a)| \geq \varepsilon) \leq P(|\xi_n - a| \geq \delta) \to 0, \quad(n\to\infty) \end{aligned}\]

○○○○○○

For example, if \(\xi_n \stackrel{\mbox{P}}{\rightarrow} a\), then \[\begin{aligned} \xi_n^2 \stackrel{\mbox{P}}{\rightarrow}& a^2 \\ 1/\xi_n \stackrel{\mbox{P}}{\rightarrow}& 1/a \quad(\text{provided } a\neq 0)\\ \sqrt{\xi_n} \stackrel{\mbox{P}}{\rightarrow}& \sqrt{a} \quad(\text{provided } a\geq 0 \text{ and } \xi_n \geq 0) \end{aligned}\]
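The first example can be checked by simulation. The sketch below is only an illustration under assumptions that are not part of the text: \(\xi_n\) is taken to be the mean of \(n\) independent U(0,1) draws (so \(a = 0.5\) by the weak law of large numbers), \(g(x) = x^2\), and the names `tail_prob`, `eps`, `reps`, and `seed` are hypothetical.

```python
import random

def tail_prob(n, eps=0.05, reps=500, seed=0):
    """Monte Carlo estimate of P(|g(xi_n) - g(a)| >= eps) for g(x) = x**2.

    Assumption for illustration: xi_n is the mean of n independent U(0,1)
    draws, which converges in probability to a = 0.5.
    """
    rng = random.Random(seed)
    a = 0.5
    hits = 0
    for _ in range(reps):
        xi_n = sum(rng.random() for _ in range(n)) / n
        if abs(xi_n ** 2 - a ** 2) >= eps:
            hits += 1
    return hits / reps

# Estimated tail probabilities for increasing n; they should shrink toward 0.
probs = {n: tail_prob(n) for n in (10, 100, 1000)}
for n, p in probs.items():
    print(n, p)
```

The estimated probabilities shrink as \(n\) grows, which is what Theorem 18.5 predicts for a function continuous at \(a\).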

Theorem 18.6 If \(\xi_n \stackrel{\mbox{P}}{\rightarrow} \xi\) and \(g(\cdot)\) is a continuous function, then \[\begin{aligned} g(\xi_n) \stackrel{\mbox{P}}{\rightarrow} g(\xi). \end{aligned}\]

For a proof see Tucker, H.G. (1967), A Graduate Course in Probability, New York: Academic Press.

Theorem 18.7 If \(\xi_n \stackrel{\mbox{P}}{\rightarrow} \xi\) and \(\eta_n \stackrel{\mbox{P}}{\rightarrow} \eta\), then \[\begin{aligned} \xi_n \eta_n \stackrel{\mbox{P}}{\rightarrow} \xi \eta. \end{aligned}\]

Proof. Write

\[\begin{aligned} \xi_n \eta_n =& \frac{1}{2}\left[ \xi_n^2 + \eta_n^2 - (\xi_n - \eta_n)^2 \right] \end{aligned}\] By Theorem 18.6 and Lemma 18.1, \(\xi_n^2 \stackrel{\mbox{P}}{\rightarrow} \xi^2\), \(\eta_n^2 \stackrel{\mbox{P}}{\rightarrow} \eta^2\), \(\xi_n - \eta_n \stackrel{\mbox{P}}{\rightarrow} \xi - \eta\), and \((\xi_n - \eta_n)^2 \stackrel{\mbox{P}}{\rightarrow} (\xi - \eta)^2\), so \[\begin{aligned} \xi_n \eta_n \stackrel{\mbox{P}}{\rightarrow}& \frac{1}{2}\left[ \xi^2 + \eta^2 - (\xi-\eta)^2 \right] = \xi \eta \end{aligned}\]

○○○○○○

18.3 Convergence in Distribution

Theorem 18.8 If \(\xi_n \stackrel{\mbox{P}}{\rightarrow} \xi\), then \(\xi_n \stackrel{\mbox{d}}{\rightarrow} \xi\).

Proof. Let \(\xi_n \sim F_n(\cdot)\), \(\xi \sim F(\cdot)\), and let \(x\) be a continuity point of \(F(\cdot)\). For any \(\epsilon>0\), \[\begin{aligned} F_n(x) =& P(\xi_n \leq x) \\ =& P\left( \xi_n \leq x, \ |\xi_n - \xi| < \epsilon \right) + P\left( \xi_n \leq x, \ |\xi_n - \xi| \geq \epsilon \right) \\ \leq& P(\xi \leq x+\epsilon) + P(|\xi_n - \xi| \geq \epsilon) \end{aligned}\] Letting \(n \to \infty\) and using \(P(|\xi_n - \xi| \geq \epsilon) \to 0\) gives \[\begin{aligned} \varlimsup_{n\to\infty} F_n(x) \leq F(x+\epsilon). \end{aligned}\] On the other hand, \[\begin{aligned} P(\xi_n>x) =& P(\xi_n > x, \ |\xi_n - \xi| < \epsilon) + P(\xi_n > x, \ |\xi_n - \xi| \geq \epsilon) \\ \leq& P(\xi > x - \epsilon) + P(|\xi_n - \xi| \geq \epsilon), \\ \varlimsup_{n\to\infty} \left[ 1 - F_n(x) \right] \leq& 1 - F(x - \epsilon), \\ \varliminf_{n\to\infty} F_n(x) \geq& F(x - \epsilon), \end{aligned}\] In summary, \[\begin{aligned} F(x-\epsilon) \leq \varliminf_{n\to\infty} F_n(x) \leq \varlimsup_{n\to\infty} F_n(x) \leq F(x+\epsilon), \end{aligned}\] and letting \(\epsilon \to 0+\) and using the continuity of \(F\) at \(x\) gives \[\begin{aligned} \lim_{n\to\infty} F_n(x) = F(x). \end{aligned}\]

○○○○○○

Theorem 18.9 If \(a\) is a constant and \(\xi_n \stackrel{\mbox{d}}{\rightarrow} a\), then \(\xi_n \stackrel{\mbox{P}}{\rightarrow} a\).

Proof. Let \[\begin{aligned} F(x) =& \begin{cases} 1, & x \geq a \\ 0, & x < a \end{cases} \end{aligned}\] be the distribution function of the constant \(a\). Its only discontinuity is at \(x = a\), so \[\begin{aligned} P(\xi_n \leq x) \to F(x), \ \forall x \neq a. \end{aligned}\]

For every \(\delta>0\), \[\begin{aligned} & P(|\xi_n - a| > \delta) \\ =& P(\xi_n > a + \delta) + P(\xi_n < a - \delta) \\ =& 1 - P(\xi_n \leq a + \delta) + P(\xi_n < a - \delta) \\ \leq& 1 - P(\xi_n \leq a + \delta) + P(\xi_n \leq a - \delta) \\ \to& 0 \quad(n \to \infty) \end{aligned}\]

○○○○○○

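Convergence in distribution does not imply convergence in probability. For example, let \(X \sim \text{N}(0,1)\); then \(-X\) and \(X\) have the same distribution. Let \[\begin{aligned} X_n = \begin{cases} X, & \text{$n$ even}, \\ -X, & \text{$n$ odd}, \end{cases} \end{aligned}\] Then \(\{ X_n \}\) converges in distribution to \(X\) but does not converge in probability to \(X\), because \(|X_n - X| = 2|X|\) for every odd \(n\).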
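The failure of convergence in probability in this example can be computed exactly rather than simulated. The following sketch (function names are hypothetical) evaluates \(P(|X_n - X| > \delta)\) using the standard normal distribution function expressed through `math.erf`.

```python
import math

def normal_cdf(x):
    """Standard normal distribution function via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def prob_far_from_x(n, delta):
    """Exact P(|X_n - X| > delta) for X ~ N(0,1), X_n = X (n even), -X (n odd)."""
    if n % 2 == 0:
        return 0.0  # X_n = X, so the difference is identically zero
    # For odd n, |X_n - X| = 2|X|, so the event is {|X| > delta / 2}.
    return 2.0 * (1.0 - normal_cdf(delta / 2.0))

values = [prob_far_from_x(n, 1.0) for n in range(1, 7)]
print(values)
```

The probabilities alternate between a fixed positive number (about 0.617 for \(\delta = 1\)) and 0, so they have no limit, while every \(X_n\) has exactly the N(0,1) distribution.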

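Convergence of probability mass functions (PMFs) is not the same as convergence of distribution functions. For example, take \(\xi_n = 2 + \frac{1}{n}\); the PMF of \(\xi_n\) is \[\begin{aligned} p_n(x) = \begin{cases} 1, & x = 2+\frac{1}{n}, \\ 0, & \text{otherwise} \end{cases}, \end{aligned}\] so \[\begin{aligned} \lim_n p_n(x) = 0, \ x \in (-\infty, \infty), \end{aligned}\] which is not a PMF, yet the distribution function of \(\xi_n\) converges to the distribution function of \(\xi=2\) at every continuity point of the latter.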
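A small exact computation makes the contrast concrete. This is a sketch with hypothetical helper names; it uses `fractions.Fraction` so that the comparison with \(2 + 1/n\) is exact.

```python
from fractions import Fraction

def pmf_n(x, n):
    """PMF of the constant xi_n = 2 + 1/n evaluated at x."""
    return 1 if Fraction(x) == 2 + Fraction(1, n) else 0

def cdf_n(x, n):
    """Distribution function of xi_n = 2 + 1/n evaluated at x."""
    return 1 if Fraction(x) >= 2 + Fraction(1, n) else 0

def cdf_limit(x):
    """Distribution function of the constant xi = 2."""
    return 1 if Fraction(x) >= 2 else 0

x = Fraction(5, 2)
pmf_values = [pmf_n(x, n) for n in (1, 2, 3, 100)]   # nonzero only at n = 2
cdf_values = [cdf_n(x, n) for n in (2, 3, 100)]      # already equal to F(x) = 1
at_jump = [cdf_n(2, n) for n in (1, 10, 1000)]       # stays 0 although F(2) = 1
print(pmf_values, cdf_values, at_jump, cdf_limit(2))
```

At the jump point \(x = 2\) the distribution functions do not converge to \(F(2) = 1\), which is why the definition of convergence in distribution only looks at continuity points of \(F\).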

If the density functions \(p_n(x)\) of \(\xi_n\) satisfy \(p_n(x) \to p(x)\), where \(p(x)\) is the density function of \(\xi\), and the \(p_n(x)\) have an integrable upper bound, then by the dominated convergence theorem \(\xi_n\) converges in distribution to \(\xi\).

Theorem 18.10 If \(\xi_n \stackrel{\mbox{d}}{\rightarrow} \xi\) and \(\eta_n \stackrel{\mbox{P}}{\rightarrow} 0\), then \[\begin{aligned} \xi_n + \eta_n \stackrel{\mbox{d}}{\rightarrow} \xi. \end{aligned}\]

Proof. Let \(x_0\) be a continuity point of the distribution function \(F(x)\) of \(\xi\). For \(\delta>0\), from \[\begin{aligned} & P(\xi_n + \eta_n \leq x_0) \\ =& P(\xi_n + \eta_n \leq x_0,\ \eta_n \leq -\delta) + P(\xi_n + \eta_n \leq x_0,\ \eta_n > -\delta) \\ \leq& P(\eta_n \leq -\delta) + P(\xi_n \leq x_0 + \delta) \\ \leq& P(|\eta_n| \geq \delta) + P(\xi_n \leq x_0 + \delta) \end{aligned}\] it follows that \[\begin{align} & P(\xi_n + \eta_n \leq x_0) - F(x_0) \\ \leq& [ P(\xi_n \leq x_0 + \delta) - F(x_0 + \delta) ] + [ F(x_0 + \delta) - F(x_0) ] + P(|\eta_n| \geq \delta) \tag{18.1} \end{align}\]

On the other hand, \[\begin{aligned} & P(\xi_n + \eta_n \leq x_0) \\ \geq& P(\xi_n + \eta_n \leq x_0,\ \eta_n \leq \delta) \\ \geq& P(\xi_n + \delta \leq x_0,\ \eta_n \leq \delta) \\ =& P(\xi_n + \delta \leq x_0 ) - P(\xi_n + \delta \leq x_0,\ \eta_n > \delta) \\ \geq& P(\xi_n \leq x_0 - \delta) - P(\eta_n > \delta) \\ \geq& P(\xi_n \leq x_0 - \delta) - P(|\eta_n| > \delta) \end{aligned}\] and therefore \[\begin{align} & P(\xi_n + \eta_n \leq x_0) - F(x_0) \\ \geq& [ P(\xi_n \leq x_0 - \delta) - F(x_0 - \delta) ] + [ F(x_0 - \delta) - F(x_0) ] - P(|\eta_n| > \delta) \tag{18.2} \end{align}\]

Given any \(\varepsilon>0\), choose \(\delta_1>0\) small enough that \[\begin{aligned} F(x_0 + \delta_1) - F(x_0) < \frac{\varepsilon}{3} \end{aligned}\] and \(x_0 + \delta_1\) is a continuity point of \(F(x)\) (by Lebesgue's theorem a monotone function is differentiable almost everywhere, so it has continuity points in every interval, however small). Since \(\xi_n \stackrel{\mbox{d}}{\rightarrow} \xi\) and \(\eta_n \stackrel{\mbox{P}}{\rightarrow} 0\), there exists \(n_1\) such that for all \(n \geq n_1\), \[\begin{aligned} P(\xi_n \leq x_0 + \delta_1) - F(x_0 + \delta_1) <& \frac{\varepsilon}{3} \\ P(|\eta_n| \geq \delta_1) <& \frac{\varepsilon}{3} \end{aligned}\] By (18.1) with \(\delta = \delta_1\), for \(n \geq n_1\), \[\begin{aligned} P(\xi_n + \eta_n \leq x_0) - F(x_0) < \varepsilon \end{aligned}\]

Next choose \(\delta_2 > 0\) such that \[\begin{aligned} F(x_0) - F(x_0 - \delta_2) < \frac{\varepsilon}{3} \end{aligned}\] and \(x_0 - \delta_2\) is a continuity point of \(F(x)\). There exists \(n_2 \geq n_1\) such that for all \(n \geq n_2\), \[\begin{aligned} P(\xi_n \leq x_0 - \delta_2) - F(x_0 - \delta_2) >& - \frac{\varepsilon}{3} \\ P(|\eta_n| > \delta_2) <& \frac{\varepsilon}{3} \end{aligned}\] By (18.2) with \(\delta = \delta_2\), for \(n \geq n_2\), \[\begin{aligned} P(\xi_n + \eta_n \leq x_0) - F(x_0) > -\varepsilon \end{aligned}\] Hence for \(n \geq n_2\), \[\begin{aligned} |P(\xi_n + \eta_n \leq x_0) - F(x_0)| < \varepsilon \end{aligned}\] that is, \[\begin{aligned} \lim_{n\to\infty} P(\xi_n + \eta_n \leq x_0) = F(x_0) \end{aligned}\] and so \[\begin{aligned} \xi_n + \eta_n \stackrel{\mbox{d}}{\rightarrow} \xi. \end{aligned}\]

○○○○○○

Theorem 18.11 If \(\xi_n \stackrel{\mbox{d}}{\rightarrow} \xi\) and \(\eta_n \stackrel{\mbox{P}}{\rightarrow} 1\), then \(\eta_n \xi_n \stackrel{\mbox{d}}{\rightarrow} \xi\).

The proof is similar to that of Theorem 18.10.

Theorem 18.12 If \(\xi_n \stackrel{\mbox{d}}{\rightarrow} \xi\) and \(g(\cdot)\) is a continuous function defined on the support of \(\xi\), then \[\begin{aligned} g(\xi_n) \stackrel{\mbox{d}}{\rightarrow} g(\xi). \end{aligned}\]

Proof omitted. For example, if \(\xi_n \stackrel{\mbox{d}}{\rightarrow} \text{N}(0,1)\), then \(\xi_n^2 \stackrel{\mbox{d}}{\rightarrow} \chi^2(1)\).

Theorem 18.13 If \(\xi_n \stackrel{\mbox{d}}{\rightarrow} \xi\), \(A_n \stackrel{\mbox{P}}{\rightarrow} a\), and \(B_n \stackrel{\mbox{P}}{\rightarrow} b\), where \(a\) and \(b\) are constants, then \[\begin{aligned} A_n + B_n \xi_n \stackrel{\mbox{d}}{\rightarrow} a + b\xi. \end{aligned}\]

Proof omitted.

Theorem 18.14 Let \(\{N_n \}\) be a sequence of positive-integer-valued random variables with \(N_n \leq N_{n+1}\) for all \(n\), and let \(\xi_n\) converge in distribution to \(\xi\).

(1) If \(\{ N_n \}\) and \(\{ \xi_n \}\) are independent and \(N_n \to \infty\) in probability, then \(\xi_{N_n}\) converges in distribution to \(\xi\).

(2) If there exists \(c>0\) such that \(N_n / n \to c\) a.s., then \(\xi_{N_n}\) converges in distribution to \(\xi\).

The second statement is known as Anscombe's theorem. Proof omitted.

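Theorem 18.15 Let the sequence of random variables \(\{ \xi_n \}\) converge in distribution to \(\xi\), and let \(p>0\). If there exists a nonnegative random variable \(\eta\) such that \(E\eta^p<\infty\) and \(|\xi_n| \leq \eta\) for all \(n\), then \(E \xi_n^p \to E \xi^p\).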

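Theorem 18.16 Let the sequence of random variables \(\{ \xi_n \}\) converge in distribution to \(\xi\), and let \(p>0\). If \(\{ |\xi_n|^p \}\) satisfies the following uniform integrability condition: \[ \lim_{c \to \infty} \sup_{n \geq 1} E \left\{ |\xi_n|^p \boldsymbol 1_{\{ |\xi_n|^p > c \}} \right\} = 0 \] then \(E \xi_n^p \to E \xi^p\).

Proof omitted.

Theorem 18.17 Let \(\{ F_1, F_2, \dots \}\) be a sequence of distribution functions such that \(F_n\) converges in distribution to \(F\). If for some \(\delta>0\) and \(p_0 > 0\) the sequence \(\{ \int_{\mathbb R} |x|^{p_0+\delta} d F_n(x): n \geq 1 \}\) is bounded, then for any \(p \in [0, p_0]\) and any integer \(k \in (0, p_0]\), as \(n \to \infty\), \[\begin{aligned} \int_{\mathbb R} x^k d F_n(x) \to& \int_{\mathbb R} x^k d F(x) , \\ \int_{\mathbb R} |x|^p d F_n(x) \to& \int_{\mathbb R} |x|^p d F(x) \end{aligned}\]

See the corollary on p. 126 of 朱成熹, 《测度论基础》 (科学出版社, 1983). In particular, bounded second moments together with convergence in distribution imply convergence of first moments, and bounded third moments together with convergence in distribution imply convergence of second moments.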

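Counterexample (moments need not converge without such a condition): let \(U \sim \text{U}(0, 1)\) and \(X_n = n \boldsymbol 1_{\{ U < \frac{1}{n} \}}\). Then \(X_n \to 0\) a.s., but \(EX_n \equiv 1\) while \(E 0 = 0\).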

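18.4 Probability Generating Functions

Let \(X\) be a discrete random variable taking nonnegative integer values with \(P(X=j) = p_j\). Then the series \[ P(t) = E t^X = \sum_{j=0}^\infty t^j p_j \] converges absolutely and uniformly for \(t \in [-1, 1]\); \(P(t)\) is called the probability generating function of \(X\).

Theorem 18.18 Let the nonnegative-integer-valued random variable \(X\) have probability generating function \(P(t) = E t^X\), \(t \in [-1,1]\), and probability distribution \(\{ p_j \}\). Then the distribution and the probability generating function determine each other uniquely.

That the distribution determines the probability generating function is obvious; the proof of the converse is omitted. See (李贤平 2010), Section 4.4, and (何书元 2006), Section 5.1.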

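When the corresponding moments exist, they can be read off from the derivatives of \(P(t)\) at \(t = 1\) (understood as left-hand limits): \[ \begin{aligned} EX =& P'(1) \\ E(X(X-1)) =& P''(1) \\ \text{Var}(X) =& EX^2 - (EX)^2 = P''(1) + P'(1) - [P'(1)]^2 \end{aligned} \]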
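As a quick check of these formulas, the sketch below computes \(P'(1)\) and \(P''(1)\) term by term for a finite PMF. The Binomial(4, 0.3) distribution and the function name `pgf_moments` are illustrative choices, not taken from the text.

```python
from math import comb

def pgf_moments(pmf):
    """Mean and variance from P'(1) and P''(1), where pmf[j] = P(X = j)."""
    d1 = sum(j * p for j, p in enumerate(pmf))            # P'(1) = sum of j * p_j
    d2 = sum(j * (j - 1) * p for j, p in enumerate(pmf))  # P''(1) = sum of j(j-1) * p_j
    return d1, d2 + d1 - d1 ** 2

# Illustrative distribution: Binomial(n=4, p=0.3), with mean np and variance np(1-p).
n, p = 4, 0.3
pmf = [comb(n, j) * p ** j * (1 - p) ** (n - j) for j in range(n + 1)]
mean, var = pgf_moments(pmf)
print(mean, var)
```

The result agrees with the known binomial mean \(np = 1.2\) and variance \(np(1-p) = 0.84\) up to floating-point rounding.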

18.5 Moment Generating Functions

For a random variable \(X\), if there exists \(h>0\) such that \[ M(t) = E e^{t X} \] exists for \(t \in (-h, h)\), then \(M(t)\) is called the moment generating function of \(X\). The condition can be relaxed to existence for \(t \in [0, h)\), or for \(t \in (-h, 0]\).

When the moment generating function exists, \[ \begin{aligned} M'(t) =& E (X e^{t X}), \quad M'(0) = EX \\ M''(t) =& E (X^2 e^{t X}), \quad M''(0) = E(X^2) \\ & \cdots\cdots \\ M^{(n)}(t) =& E (X^n e^{t X}), \quad M^{(n)}(0) = E(X^n) \end{aligned} \] which is the origin of the name "moment generating function".
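The relation between derivatives at 0 and moments can be illustrated numerically. The sketch below assumes an Exp(1) random variable, whose moment generating function \(M(t) = 1/(1-t)\) for \(t < 1\) and moments \(EX = 1\), \(E X^2 = 2\) are standard facts; the central-difference helper `first_two_moments` is a hypothetical name.

```python
def mgf_exp1(t):
    """MGF of the Exp(1) distribution, valid for t < 1."""
    return 1.0 / (1.0 - t)

def first_two_moments(mgf, h=1e-3):
    """Approximate M'(0) and M''(0) by central finite differences."""
    d1 = (mgf(h) - mgf(-h)) / (2 * h)
    d2 = (mgf(h) - 2 * mgf(0.0) + mgf(-h)) / h ** 2
    return d1, d2

m1, m2 = first_two_moments(mgf_exp1)
print(m1, m2)  # approximations of EX and E(X^2)
```

Finite differences are used only to keep the example free of symbolic tools; the step size `h` trades truncation error against rounding error.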

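Theorem 18.19 Let the random variable \(X\) have moment generating function \(M(t) = E e^{t X}\), \(t \in (-h, h)\), and distribution function \(F(x)\). Then the distribution function and the moment generating function determine each other uniquely.

That the distribution function determines the moment generating function is obvious; the proof that the moment generating function determines the distribution function is omitted.

Theorem 18.20 Let \(\xi_n\) have moment generating function \(M_n(t) = E e^{t \xi_n}\), \(t \in (-h, h)\), and let \(\xi\) have moment generating function \(M(t) = E e^{t \xi}\), \(t \in [-h_1, h_1]\), \(0 < h_1 \leq h\). If \(\lim_{n\to\infty} M_n(t) = M(t)\) for all \(t \in [-h_1, h_1]\), then \(\xi_n \stackrel{\mbox{d}}{\rightarrow} \xi\).

Proof omitted.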

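18.6 Characteristic Functions

When they exist, the probability generating function and the moment generating function both determine the distribution and behave well when studying the distribution of sums of independent random variables, but neither exists for every random variable. The characteristic function has the same good properties and exists for every random variable; the price is that it involves complex numbers and is mathematically more involved.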

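For a random variable \(X\), define \[ \phi(t) = E e^{itX} = E \cos (tX) + i\, E \sin (tX), \ t \in \mathbb R ; \] \(\phi(t)\) is called the characteristic function of \(X\). The characteristic function always exists and is unique.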

Theorem 18.21 (inversion formula) If the distribution function \(F(x)\) of \(X\) is continuous at \(a\) and \(b\) (\(a < b\)), then \[ F(b) - F(a) = \frac{1}{2\pi} \lim_{T \to \infty} \int_{-T}^T \frac{e^{-ita} - e^{-itb}}{it} \phi(t) \, dt \]

Proof omitted.
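The inversion formula can be checked numerically under the convention \(\phi(t) = E e^{itX}\), for which the kernel is \((e^{-ita} - e^{-itb})/(it)\). The sketch below assumes \(X \sim \text{N}(0,1)\) with \(\phi(t) = e^{-t^2/2}\) and uses a plain midpoint rule; the function names, the cutoff `T`, and the number of steps are illustrative choices.

```python
import cmath
import math

def phi_std_normal(t):
    """Characteristic function of N(0,1)."""
    return math.exp(-0.5 * t * t)

def inversion(a, b, phi, T=10.0, steps=20000):
    """Midpoint-rule approximation of (1/(2*pi)) * integral over [-T, T]."""
    h = 2.0 * T / steps
    total = 0.0
    for k in range(steps):
        t = -T + (k + 0.5) * h  # midpoints never hit t = 0 when steps is even
        kernel = (cmath.exp(-1j * t * a) - cmath.exp(-1j * t * b)) / (1j * t)
        total += (kernel * phi(t)).real  # imaginary parts cancel by symmetry
    return total * h / (2.0 * math.pi)

approx = inversion(-1.0, 1.0, phi_std_normal)
exact = math.erf(1.0 / math.sqrt(2.0))  # F(1) - F(-1) for N(0,1)
print(approx, exact)
```

Both numbers are approximately 0.6827, the probability that a standard normal variable lies within one standard deviation of its mean.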

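Theorem 18.22 The distribution function and the characteristic function of a random variable determine each other uniquely.

Properties of the characteristic function:

Theorem 18.23 Let \(X\) be a random variable and \(\phi(t) = E e^{itX}\). Then

(1) \(\phi(0) = 1\), \(|\phi(t)| \leq 1\), \(\phi(-t) = \overline{\phi(t)}\).

(2) \(\phi(t)\) is uniformly continuous on \((-\infty, \infty)\).

(3) If \(E X^k\) exists, then \[ \phi^{(k)}(t) = i^k E(X^k e^{itX}), \quad \phi^{(k)}(0) = i^k E (X^k) \]

(4) Nonnegative definiteness: for any complex numbers \(a_1, a_2, \dots, a_n\) and real numbers \(t_1, t_2, \dots, t_n\), \[ \sum_{k=1}^n \sum_{j=1}^n \phi(t_k - t_j) a_k \bar a_j \geq 0 \]

(5) Let the random variables \(X_1, X_2, \dots, X_n\) be mutually independent. If \(X_j\) has characteristic function \(\phi_j(t)\) and \(Y = \sum_{j=1}^n X_j\), then the characteristic function of \(Y\) is \[ \phi_Y(t) = \prod_{j=1}^n \phi_j(t) . \]

See (何书元 2006), Section 5.2.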

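The characteristic function of the normal distribution N(\(\mu\), \(\sigma^2\)) is \[ \phi(t) = \exp\{ i \mu t - \frac12 \sigma^2 t^2 \} . \]

Theorem 18.24 (continuity theorem) Let \(\xi_n\) have characteristic function \(\phi_n(t)\) and \(\xi\) have characteristic function \(\phi(t)\). Then \(\xi_n \stackrel{d}{\to} \xi\) if and only if \[ \lim_{n\to\infty} \phi_n(t) = \phi(t), \ \forall t \in (-\infty, \infty) \]

Proof omitted.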
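The continuity theorem can be illustrated with the classical Poisson approximation to the binomial distribution, which is not an example from the text. The sketch assumes the standard formulas \((1 - p + p e^{it})^n\) for the Binomial(\(n\), \(p\)) characteristic function and \(\exp\{\lambda(e^{it} - 1)\}\) for Poisson(\(\lambda\)); `max_gap` is a hypothetical helper.

```python
import cmath
import math

def cf_binomial(t, n, p):
    """Characteristic function of Binomial(n, p)."""
    return (1 - p + p * cmath.exp(1j * t)) ** n

def cf_poisson(t, lam):
    """Characteristic function of Poisson(lam)."""
    return cmath.exp(lam * (cmath.exp(1j * t) - 1))

def max_gap(n, lam=2.0, grid=200):
    """Largest |phi_n(t) - phi(t)| on a grid over one period [-pi, pi]."""
    ts = [-math.pi + 2 * math.pi * k / grid for k in range(grid + 1)]
    return max(abs(cf_binomial(t, n, lam / n) - cf_poisson(t, lam)) for t in ts)

gaps = {n: max_gap(n) for n in (10, 100, 10000)}
for n, g in gaps.items():
    print(n, g)
```

Both characteristic functions are \(2\pi\)-periodic because the variables are integer valued, so a grid over one period is enough. The shrinking gap corresponds to Binomial(\(n\), \(\lambda/n\)) converging in distribution to Poisson(\(\lambda\)).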

For a random vector \(\boldsymbol X\), define its characteristic function as \[ \phi(\boldsymbol t) = E e^{i \boldsymbol t^T \boldsymbol X}, \ \boldsymbol t \in \mathbb R^n \] The characteristic function is always well defined.

Theorem 18.25 (properties of the characteristic function of a random vector) Let \(\boldsymbol X = (X_1, \dots, X_n)^T\) have characteristic function \(\phi(\boldsymbol t)\), and let the component \(X_j\) have characteristic function \(\phi_j(t)\). Then

(1) \(\phi(\boldsymbol t)\) and the distribution function of \(\boldsymbol X\) determine each other uniquely;

(2) \(X_1, X_2, \dots, X_n\) are mutually independent if and only if \[ \phi(\boldsymbol t) = \phi_1(t_1) \phi_2(t_2) \dots \phi_n(t_n), \ \forall \boldsymbol t = (t_1, t_2, \dots, t_n) \in \mathbb R^n . \]

(3) Let \(\{\boldsymbol\xi_k \}\) be a sequence of random vectors and let \(\boldsymbol\xi_k\) have characteristic function \(\phi_k(\boldsymbol t)\). If \(\phi_k(\boldsymbol t)\) converges to a function \(g(\boldsymbol t)\) that is continuous at \(\boldsymbol t = \boldsymbol 0\), then \(g(\boldsymbol t)\) is the characteristic function of some random vector \(\boldsymbol\xi\), and for every \(\boldsymbol a \in \mathbb R^n\), \(\boldsymbol a^T \boldsymbol\xi_k \stackrel{\mbox{d}}{\longrightarrow} \boldsymbol a^T \boldsymbol\xi\).

18.7 Central Limit Theorems

Theorem 18.26 Let \(\{ \xi_n \}\) be a sequence of independent and identically distributed random variables with \(\mu = E\xi_1\) and \(0 < \sigma^2 = \text{Var}(\xi_1) < \infty\), and let \(\bar \xi_n = \frac{1}{n} \sum_{i=1}^n \xi_i\). Then \[ \frac{\bar\xi_n - \mu}{\sigma / \sqrt{n}} \stackrel{\mbox{d}}{\longrightarrow} \mbox{N}(0,1) . \]

Theorem 18.27 Let \(\{ \xi_n \}\) be a sequence of independent and identically distributed random variables with \(\mu = E\xi_1\) and \(0 < \sigma^2 = \text{Var}(\xi_1) < \infty\), and let \(\bar \xi_n = \frac{1}{n} \sum_{i=1}^n \xi_i\). If \(\hat\sigma_n^2\) converges in probability to \(\sigma^2\), then \[ \frac{\bar\xi_n - \mu}{\hat\sigma_n / \sqrt{n}} \stackrel{\mbox{d}}{\longrightarrow} \mbox{N}(0,1) . \]
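Theorem 18.27 is what justifies the usual large-sample interval \(\bar\xi_n \pm 1.96\, \hat\sigma_n / \sqrt{n}\). The simulation sketch below is an illustration under assumptions not made in the text: the data are Exp(1) (so \(\mu = 1\)), \(\hat\sigma_n^2\) is the sample variance, and the names `t_statistic`, `coverage`, `reps`, and `seed` are hypothetical.

```python
import math
import random

def t_statistic(sample, mu):
    """Studentized mean (mean - mu) / (sigma_hat / sqrt(n))."""
    n = len(sample)
    mean = sum(sample) / n
    var_hat = sum((x - mean) ** 2 for x in sample) / (n - 1)  # sample variance
    return (mean - mu) / math.sqrt(var_hat / n)

def coverage(n=200, reps=2000, seed=1):
    """Fraction of replications with |T_n| <= 1.96 for Exp(1) data."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(reps):
        sample = [rng.expovariate(1.0) for _ in range(n)]
        if abs(t_statistic(sample, 1.0)) <= 1.96:
            inside += 1
    return inside / reps

cov = coverage()
print(cov)  # should be near the nominal 0.95
```

For skewed data such as the exponential the finite-sample coverage is typically a little below 0.95 and approaches it as \(n\) grows.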

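18.8 Boundedness in Probability

Let \(\{\xi_n\}\) be a sequence of random variables. If for every \(\varepsilon>0\) there exists a positive number \(M\) such that \[\begin{aligned} \sup_n P(|\xi_n|>M) \leq \varepsilon \end{aligned}\] then the sequence \(\{\xi_n\}\) is said to be bounded in probability, written \(\xi_n = O_p(1)\). For a single random variable \(\xi\) and any \(\varepsilon>0\) there clearly exists \(M>0\) such that \(P(|\xi| > M) \leq \varepsilon\); boundedness in probability of a sequence requires that, given \(\varepsilon>0\), one common \(M\) makes \(P(|\xi_n| > M) \leq \varepsilon\) hold for all \(n\) simultaneously.

Let \(\{c_n\}\) be a sequence of nonzero constants. If \(\{\xi_n / c_n \} = O_p(1)\), we write \(\xi_n = O_p(c_n)\). For a sequence of random variables \(\eta_n \neq 0\), if \(\{\xi_n / \eta_n\} = O_p(1)\) we write \(\xi_n = O_p(\eta_n)\).

An equivalent definition of boundedness in probability: \(\{ \xi_n \}\) is bounded in probability if for every \(\varepsilon>0\) there exist \(M>0\) and \(N\) such that for \(n \geq N\), \[\begin{aligned} P(|\xi_n| \leq M) \geq 1 - \varepsilon. \end{aligned}\] Indeed, if the original definition holds then the condition of the equivalent definition clearly holds as well. Conversely, if the condition of the equivalent definition holds, then for \(n \geq N\), \[\begin{aligned} P(|\xi_n| \leq M) \geq 1 - \varepsilon, \quad P(|\xi_n| > M) \leq \varepsilon. \end{aligned}\] For \(j=1,2,\dots,N\) there exists \(M_j>0\) such that \[\begin{aligned} P(|\xi_j| > M_j) < \varepsilon \end{aligned}\] Let \(M' = \max(M, M_1, M_2, \dots, M_N)\); then \[\begin{aligned} P(|\xi_n| > M') \leq& \varepsilon, \ \forall n \in \mathbb N_+, \\ \sup_{n \in \mathbb N_+} P(|\xi_n| > M') \leq& \varepsilon, \end{aligned}\] so the original definition is satisfied.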

○○○○○○

If \(\xi_n \stackrel{\mbox{P}}{\rightarrow} 0 \ (n\to\infty)\), we write \(\xi_n = o_p(1)\). Let \(\{c_n\}\) be a sequence of nonzero constants. If \(\{\xi_n / c_n \} = o_p(1)\), we write \(\xi_n = o_p(c_n)\). For a sequence of random variables \(\eta_n \neq 0\), if \(\{\xi_n / \eta_n \} = o_p(1)\) we write \(\xi_n = o_p(\eta_n)\).

Theorem 18.28 If \(\xi_n = o_p(c_n)\), then \(\xi_n = O_p(c_n)\).

Proof. By the definition of convergence in probability, for every \(\delta>0\) and \(\varepsilon>0\) there exists \(N\) such that for \(n>N\), \[\begin{aligned} P( |\xi_n / c_n| > \delta) < \varepsilon \end{aligned}\] Choose \(M \geq \delta\) such that \[\begin{aligned} P(\max_{1 \leq n \leq N}|\xi_n / c_n| > M) < \varepsilon, \end{aligned}\] Then \[\begin{aligned} P(|\xi_n / c_n| > M) < \varepsilon, \quad n=1,2,\dots \end{aligned}\] that is, \(\xi_n/c_n = O_p(1)\).

○○○○○○

Theorem 18.29 If \(\xi_n = O_p(1)\) and \(c_n \to \infty\), then \(\xi_n / c_n = o_p(1)\).

Proof. Let \(\varepsilon > 0\) and \(\delta>0\) be given. There exists \(M > 0\) such that \[\begin{aligned} P(|\xi_n| > M) < \varepsilon, \ n \in \mathbb N_+. \end{aligned}\] There exists \(N\) such that \(c_n > M / \delta\) for \(n > N\), and then \[\begin{aligned} P \left (\left| \frac{\xi_n}{c_n} \right| > \delta \right) =& P(|\xi_n| > |c_n| \delta) \leq P(|\xi_n| > M) < \varepsilon. \end{aligned}\]

○○○○○○

Theorem 18.30 For any single random variable \(\xi\), \(\xi=O_p(1)\).

Theorem 18.31 If there exists a nonnegative random variable \(\xi\) such that \(|\xi_n| \leq \xi\) a.s. for all \(n\), then \(\xi_n=O_p(1)\).

Proof. We have \(P(|\xi_n| \leq \xi)=1\). For every \(\varepsilon>0\) there exists \(M>0\) such that \(P(\xi > M)<\varepsilon\), so \(P(|\xi_n|>M) \leq P(\xi > M) < \varepsilon\) for all \(n \in \mathbb N_+\).

○○○○○○

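Theorem 18.32 If there exist constants \(d>0\) and \(c>0\) such that \(\sup_{n} E|\xi_n|^d \leq c\), then \(\xi_n=O_p(1)\).

Proof. By Markov's inequality, for any \(M>0\), \[ P(|\xi_n| > M) \leq \frac{E|\xi_n|^d}{M^d} \leq \frac{c}{M^d} \to 0 \ (M \to\infty) \] uniformly in \(n\), so for any \(\epsilon>0\) there exists \(M>0\) such that \[ P(|\xi_n| > M) < \epsilon, \ \forall n \]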

○○○○○○
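The proof of Theorem 18.32 is constructive: any \(M\) with \(c / M^d \leq \varepsilon\) works for every \(n\). The sketch below makes that explicit and compares the Markov bound with exact tails for an assumed example sequence \(\xi_n \sim \text{N}(0, 1 + 1/n)\), for which \(E \xi_n^2 = 1 + 1/n \leq 2\). The sequence and the function names are illustrative, not from the text.

```python
import math

def op1_bound(c, d, eps):
    """M = (c / eps) ** (1 / d), the smallest M with c / M**d <= eps."""
    return (c / eps) ** (1.0 / d)

def normal_two_sided_tail(m, sd):
    """Exact P(|Z| > m) for Z ~ N(0, sd**2)."""
    return math.erfc(m / (sd * math.sqrt(2.0)))

c, d, eps = 2.0, 2.0, 0.02
M = op1_bound(c, d, eps)   # common bound valid for every n
markov_bound = c / M ** d  # equals eps up to rounding

# Exact tails of the assumed sequence xi_n ~ N(0, 1 + 1/n), n = 1, ..., 50.
tails = [normal_two_sided_tail(M, math.sqrt(1.0 + 1.0 / n)) for n in range(1, 51)]
print(M, markov_bound, max(tails))
```

The exact tails are far smaller than \(\varepsilon\); Markov's inequality is crude, but it is uniform in \(n\), which is all that \(O_p(1)\) requires.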

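Theorem 18.33 If the \(\{\xi_n\}\) are identically distributed, then \(\xi_n=O_p(1)\).

Proof. For every \(\varepsilon>0\) there exists \(M>0\) such that \(P(|\xi_1|>M)<\varepsilon\). Since the \(\xi_n\) are identically distributed, \(P(|\xi_n|>M)=P(|\xi_1|>M)<\varepsilon\) for all \(n\).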

○○○○○○

Theorem 18.34 \[\begin{aligned} O_p(1) \pm O_p(1) =& O_p(1) \\ O_p(1) \cdot O_p(1) =& O_p(1) \\ O_p(1) \pm o_p(1) =& O_p(1) \\ O_p(1) \cdot o_p(1) =& o_p(1). \end{aligned}\]

Proof. We prove only \(O_p(1) \cdot o_p(1) = o_p(1)\). Let \(\xi_n = O_p(1)\) and \(\eta_n = o_p(1)\). For any given \(\delta>0\) and \(\epsilon>0\) there exists \(M>0\) such that \[\begin{aligned} \sup_n P(|\xi_n| > M) < \epsilon. \end{aligned}\] Then \[\begin{aligned} & \varlimsup_{n\to\infty} P( |\xi_n \eta_n| > \delta) \\ \leq& \varlimsup_{n\to\infty} P( |\xi_n \eta_n| > \delta, |\xi_n| \leq M) + \varlimsup_{n\to\infty} P( |\xi_n \eta_n| > \delta, |\xi_n| > M) \\ \leq& \varlimsup_{n\to\infty} P( |\eta_n| > \frac{\delta}{M} ) + \varlimsup_{n\to\infty} P( |\xi_n| > M) \\ \leq& \epsilon, \end{aligned}\] Since \(\epsilon>0\) is arbitrary, \(\lim_n P( |\xi_n \eta_n| > \delta) = 0\), that is, \(\xi_n \eta_n = o_p(1)\).

○○○○○○

Theorem 18.35 If \(\xi_n\) converges in probability to \(\xi\), then \(\{\xi_n\} = O_p(1)\).

Proof. By Theorems 18.30 and 18.34, \(\xi_n = \xi + (\xi_n - \xi) = O_p(1) + o_p(1) = O_p(1)\).

○○○○○○

Theorem 18.36 If \(\xi_n \stackrel{\mbox{d}}{\rightarrow} \xi\), then \(\xi_n = O_p(1)\).

Proof. Let \(F(x)\) be the distribution function of \(\xi\). For every \(\varepsilon>0\) there exists \(M>0\) such that \(M\) and \(-M\) are continuity points of \(F(x)\) and \(F(M) - F(-M) \geq 1 - \frac{\varepsilon}{2}\). Then \[\begin{aligned} \varliminf_{n\to\infty} P(|\xi_n| \leq M) =& \lim_{n\to\infty} P(\xi_n \leq M) - \varlimsup_{n\to\infty} P(\xi_n < -M) \\ \geq& \lim_{n\to\infty} P(\xi_n \leq M) - \lim_{n\to\infty} P(\xi_n \leq -M) \\ =& F(M) - F(-M) \geq 1 - \frac{\varepsilon}{2} > 1 - \varepsilon \end{aligned}\] Hence there exists \(N\) such that for \(n \geq N\), \[\begin{aligned} \inf_{m\geq n} P(|\xi_m| \leq M) \geq& 1-\varepsilon \\ P(|\xi_n| \leq M) \geq& 1-\varepsilon \end{aligned}\] By the equivalent definition of \(O_p\), \(\xi_n = O_p(1)\).

○○○○○○

Theorem 18.37 If \(\xi_n = O_p(1)\) and \(\eta_n / \xi_n = o_p(1)\), then \(\eta_n = o_p(1)\).

Proof. By Theorem 18.34,

\[ \eta_n = \frac{\eta_n}{\xi_n} \cdot \xi_n = o_p(1) \cdot O_p(1) = o_p(1) . \] A direct proof is as follows.

Let \(\delta > 0\) and \(\varepsilon > 0\) be given. Since \(\xi_n = O_p(1)\), there exists \(M>0\) such that \[\begin{aligned} P(|\xi_n| > M) < \frac{\varepsilon}{2}, \ \forall n \in \mathbb N_+. \end{aligned}\] Since \(\eta_n / \xi_n = o_p(1)\), there exists \(N>0\) such that for \(n > N\), \[\begin{aligned} P(|\eta_n / \xi_n| > \delta / M) < \frac{\varepsilon}{2}, \end{aligned}\] Hence for \(n > N\), \[\begin{aligned} & P(|\eta_n| > \delta) \\ =& P(|\eta_n| > \delta,\ |\xi_n| \leq M) + P(|\eta_n| > \delta,\ |\xi_n| > M) \\ \leq& P(|\eta_n / \xi_n| > \delta / M) + P(|\xi_n| > M) \\ <& \varepsilon \end{aligned}\] that is, \(\eta_n = o_p(1)\).

○○○○○○

18.9 The Delta Method

Theorem 18.38 Let the sequence of random variables \(\{ \xi_n \}\) have the limiting distribution \[\begin{aligned} \sqrt{n}(\xi_n - \theta) \stackrel{\text{d}}{\to} \text{N}(0, \sigma^2) \end{aligned}\] and let the function \(g(x)\) be differentiable at \(\theta\) with \(g'(\theta) \neq 0\). Then \[\begin{aligned} \sqrt{n}(g(\xi_n) - g(\theta)) \stackrel{\text{d}}{\to} \text{N}(0, \sigma^2 (g'(\theta))^2). \end{aligned}\]

Proof. By Taylor's formula, \[\begin{aligned} g(\xi_n) = g(\theta) + g'(\theta) (\xi_n - \theta) + \eta_n \end{aligned}\] where \[\begin{aligned} \eta_n =& h(\xi_n - \theta) \\ h(x) =& g(x + \theta) - g(\theta) - g'(\theta) x \\ h(0) =& 0, \quad h'(0) = 0 \end{aligned}\] Since \(\sqrt{n}(\xi_n - \theta)\) converges in distribution, it is \(O_p(1)\) by Theorem 18.36, so \(\xi_n - \theta = o_p(1)\) by Theorem 18.29. By Lemma 18.2 below, \(\eta_n = o_p(|\xi_n - \theta|)\), hence \[\begin{aligned} \sqrt{n}(g(\xi_n) - g(\theta)) = \sqrt{n} g'(\theta) (\xi_n - \theta) + o_p(\sqrt{n} |\xi_n - \theta|) \end{aligned}\] The first term converges in distribution to \(\text{N}(0, \sigma^2 (g'(\theta))^2)\). In the second term \(\sqrt{n} |\xi_n - \theta| = O_p(1)\), so that term is \(o_p(1)\), and the result follows from Theorem 18.10.

Lemma 18.2 Let the function \(h(x)\) be differentiable at \(x=0\) with \(h(0)=0\) and \(h'(0)=0\). If \(\xi_n = o_p(1)\), then \(h(\xi_n) = o_p(|\xi_n|)\), where \(h(\xi_n)/\xi_n\) is taken to be 0 when \(\xi_n = 0\).

Proof of the lemma. Let \(\varepsilon>0\) and \(\delta>0\) be given. Since \(h(0)=0\) and \(h'(0)=0\), there exists \(\delta_1 > 0\) such that for every \(0 < |x| \leq \delta_1\), \[\begin{aligned} \left| \frac{h(x)}{x} \right| < \delta \end{aligned}\] Hence the event \(\{ |h(\xi_n)/\xi_n| > \delta, \ |\xi_n| \leq \delta_1 \}\) is empty, and \[\begin{aligned} & P \left(\left| \frac{h(\xi_n)}{\xi_n} \right| > \delta \right) \\ =& P \left(\left| \frac{h(\xi_n)}{\xi_n} \right| > \delta, \ |\xi_n| \leq \delta_1 \right) + P \left(\left| \frac{h(\xi_n)}{\xi_n} \right| > \delta, \ |\xi_n| > \delta_1 \right) \\ =& P \left( \left| \frac{h(\xi_n)}{\xi_n} \right| > \delta, \ |\xi_n| > \delta_1 \right) \\ \leq& P(|\xi_n| > \delta_1) \to 0 \quad (n \to \infty) \end{aligned}\]

○○○○○○
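The delta method of Theorem 18.38 can be checked by simulation. The sketch below is only an illustration under assumptions that are not in the text: \(\xi_n\) is the mean of \(n\) independent Exp(1) draws (so \(\theta = 1\) and \(\sigma^2 = 1\)), \(g(x) = x^2\) (so \(g'(\theta) = 2\)), and the names `delta_method_variance`, `reps`, and `seed` are hypothetical.

```python
import math
import random

def delta_method_variance(n=300, reps=2000, seed=2):
    """Sample variance of sqrt(n) * (g(xi_n) - g(theta)) over many replications."""
    rng = random.Random(seed)
    theta = 1.0
    g = lambda x: x * x
    zs = []
    for _ in range(reps):
        xi_n = sum(rng.expovariate(1.0) for _ in range(n)) / n  # sample mean
        zs.append(math.sqrt(n) * (g(xi_n) - g(theta)))
    mean = sum(zs) / reps
    return sum((z - mean) ** 2 for z in zs) / (reps - 1)

var_sim = delta_method_variance()
var_limit = 1.0 * (2.0 * 1.0) ** 2  # sigma^2 * g'(theta)^2 = 4
print(var_sim, var_limit)
```

The simulated variance should be close to, though not exactly equal to, the limiting value 4, because of Monte Carlo error and the finite sample size.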

Lemma 18.3 For constants \(b\) and \(c\), \[\begin{aligned} \lim_{n\to\infty} \left(1 + \frac{b}{n} + o(\frac{1}{n}) \right)^{cn} = e^{bc}. \end{aligned}\]
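A quick numerical look at Lemma 18.3, taking \(1/n^2\) as one concrete choice of the \(o(1/n)\) remainder. That choice, the values \(b = 2\), \(c = 3\), and the function name are assumptions made for illustration.

```python
import math

def lemma_sequence(n, b, c):
    """(1 + b/n + r_n) ** (c*n) with r_n = 1/n**2, an assumed o(1/n) remainder."""
    return (1.0 + b / n + 1.0 / n ** 2) ** (c * n)

b, c = 2.0, 3.0
limit = math.exp(b * c)
# Relative error against e^{bc} for increasing n.
rel_errors = [abs(lemma_sequence(n, b, c) / limit - 1.0) for n in (10, 1000, 100000)]
print(rel_errors)
```

The relative error decreases roughly in proportion to \(1/n\), consistent with the stated limit.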

18.10 Limits of Random Vectors

Definition 18.1 Let \(\{ \boldsymbol\xi_n \}\) be a sequence of random vectors and \(\boldsymbol\xi\) a random vector. If for every \(\delta>0\), \[\begin{aligned} \lim_{n\to\infty} P( | \boldsymbol\xi_n - \boldsymbol\xi | > \delta ) = 0, \end{aligned}\] then \(\boldsymbol\xi_n\) is said to converge in probability to \(\boldsymbol\xi\), written \(\boldsymbol\xi_n \stackrel{\text{P}}{\to} \boldsymbol\xi\).

In this definition the vertical bars in \(| \boldsymbol\xi_n - \boldsymbol\xi |\) denote the Euclidean length.

Theorem 18.39 Let \(\boldsymbol\xi_n = (\xi_{n1}, \dots, \xi_{nm})^T\) and \(\boldsymbol\xi = (\xi_{1}, \dots, \xi_{m})^T\). Then \(\boldsymbol\xi_n \stackrel{\text{P}}{\to} \boldsymbol\xi\) if and only if \(\xi_{nj} \stackrel{\text{P}}{\to} \xi_j\), \(j=1,\dots,m\).

Definition 18.2 Let \(\{ \boldsymbol\xi_n \}\) be a sequence of random vectors, where \(\boldsymbol\xi_n\) has distribution function \(F_n(\boldsymbol x)\), and let \(\boldsymbol\xi\) be a random vector with distribution function \(F(\boldsymbol x)\). If for every continuity point \(\boldsymbol x\) of \(F(\cdot)\), \[\begin{aligned} \lim_{n\to\infty} F_n(\boldsymbol x) = F(\boldsymbol x), \end{aligned}\] then \(\boldsymbol\xi_n\) is said to converge in distribution to \(\boldsymbol\xi\), or to \(F(\cdot)\), written \(\boldsymbol\xi_n \stackrel{\text{d}}{\to} \boldsymbol\xi\) or \(\boldsymbol\xi_n \stackrel{\text{d}}{\to} F(\cdot)\).

Theorem 18.40 Let \(\{ \boldsymbol\xi_n \}\) be a sequence of random vectors and \(\boldsymbol\xi\) a random vector with \(\boldsymbol\xi_n \stackrel{\text{d}}{\to} \boldsymbol\xi\). If the function \(g(\boldsymbol x)\) is continuous on the support of \(\boldsymbol\xi\), then \(g(\boldsymbol\xi_n)\) converges in distribution to \(g(\boldsymbol\xi)\).

Corollary: if a sequence of random vectors converges in distribution, then each component converges in distribution.

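Theorem 18.41 Let \(\boldsymbol\xi_n\) have moment generating function \(M_n(\boldsymbol t) = E e^{\boldsymbol t^T \boldsymbol\xi_n}\) and let \(\boldsymbol\xi\) have moment generating function \(M(\boldsymbol t) = E e^{\boldsymbol t^T \boldsymbol\xi}\). If \(\lim_{n\to\infty} M_n(\boldsymbol t) = M(\boldsymbol t)\) for \(\| \boldsymbol t \| \leq h\) (\(h>0\)), then \(\boldsymbol\xi_n \stackrel{\mbox{d}}{\rightarrow} \boldsymbol\xi\).

Theorem 18.42 (central limit theorem for random vectors) Let \(\{ \boldsymbol\xi_n \}\) be a sequence of independent and identically distributed \(m\)-dimensional random vectors with common mean \(\boldsymbol\mu\) and covariance matrix \(\Sigma\), where \(\Sigma\) is positive definite, and suppose the common moment generating function \(M(\boldsymbol t)\) exists in an open neighborhood of \(\boldsymbol 0\). Let \[\begin{aligned} \boldsymbol\eta_n = \frac{1}{\sqrt{n}} \sum_{i=1}^n (\boldsymbol\xi_i - \boldsymbol\mu) = \sqrt{n}(\bar{\boldsymbol\xi}_n - \boldsymbol\mu), \end{aligned}\] Then \(\boldsymbol\eta_n\) converges in distribution to the \(\text{N}_m(\boldsymbol 0, \Sigma)\) distribution.

Theorem 18.43 If the sequence of \(m\)-dimensional random vectors \(\{ \boldsymbol\xi_n \}\) is asymptotically \(\text{N}_m(\boldsymbol\mu, \Sigma)\) and \(A\), \(\boldsymbol b\) are a nonrandom \(m \times m\) matrix and a nonrandom \(m\)-dimensional vector, then \(A \boldsymbol\xi_n + \boldsymbol b\) is asymptotically \(\text{N}_m(A\boldsymbol\mu + \boldsymbol b, A \Sigma A^T)\).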

Theorem 18.44 Let the sequence of \(m\)-dimensional random vectors \(\{ \boldsymbol\xi_n \}\) satisfy \[\begin{aligned} \sqrt{n}(\boldsymbol\xi_n - \boldsymbol\mu_0) \stackrel{\mbox{d}}{\rightarrow} \text{N}_m(\boldsymbol 0, \Sigma), \end{aligned}\] and let \(\boldsymbol g(\boldsymbol x)\) be a map from \(\mathbb R^m\) to \(\mathbb R^k\) (\(k \leq m\)). Collect the first-order partial derivatives into the matrix \[\begin{aligned} B = \left( \frac{\partial g_i(\boldsymbol x)}{\partial x_j} \right)_{ \substack{i=1,\dots,k;\\ j=1,\dots,m}}, \end{aligned}\] Suppose that in some neighborhood of \(\boldsymbol\mu_0\) the entries of \(B\) are continuous and \(B\) is not the zero matrix, and denote the value of \(B\) at \(\boldsymbol x= \boldsymbol\mu_0\) by \(B_0\). Then \[\begin{aligned} \sqrt{n}(\boldsymbol g(\boldsymbol\xi_n) - \boldsymbol g(\boldsymbol\mu_0)) \stackrel{\mbox{d}}{\rightarrow} \text{N}_k(\boldsymbol 0, B_0 \Sigma B_0^T). \end{aligned}\]

References

何书元. 2006. 概率论. 北京大学出版社.
李贤平. 2010. 概率论基础. 第三版. 高等教育出版社.