# factorization criterion

Let $\boldsymbol{X}=(X_{1},\ldots,X_{n})$ be a random vector whose coordinates are observations, and whose probability (density) function is $f(\boldsymbol{x}\mid\theta)$, where $\theta$ is an unknown parameter. Then a statistic $T(\boldsymbol{X})$ is a sufficient statistic for $\theta$ iff $f$ can be expressed as a product of (or factored into) two functions $g,h$, $f=gh$, where $g$ depends on $\boldsymbol{x}$ only through $T(\boldsymbol{x})$ and may depend on $\theta$, and $h$ is a function of $\boldsymbol{x}$ alone that does not involve $\theta$. In symbols, we have

 $f(\boldsymbol{x}\mid\theta)=g(T(\boldsymbol{x}),\theta)h(\boldsymbol{x}).$

Applications.

1. In view of the above statement, let’s show that the sample mean $\overline{X}$ of $n$ independent observations from a normal distribution $N(\mu,\sigma^{2})$, with $\sigma^{2}$ known, is a sufficient statistic for the unknown mean $\mu$. Since the $X_{i}$’s are independent random variables, the joint probability density function $f(\boldsymbol{x}\mid\mu)$ of the $X_{i}$ is the product of the individual density functions $f(x_{i}\mid\mu)$:

 $\begin{aligned}
 f(\boldsymbol{x}\mid\mu) &= \prod_{i=1}^{n}f(x_{i}\mid\mu)=\prod_{i=1}^{n}\frac{1}{\sqrt{2\pi\sigma^{2}}}\exp\Big[-\frac{(x_{i}-\mu)^{2}}{2\sigma^{2}}\Big] && (1)\\
 &= \frac{1}{\sqrt{(2\pi)^{n}\sigma^{2n}}}\exp\Big[\sum_{i=1}^{n}-\frac{(x_{i}-\mu)^{2}}{2\sigma^{2}}\Big] && (2)\\
 &= \frac{1}{\sqrt{(2\pi)^{n}\sigma^{2n}}}\exp\Big[\frac{-1}{2\sigma^{2}}\sum_{i=1}^{n}x_{i}^{2}\Big]\exp\Big[\frac{\mu}{\sigma^{2}}\sum_{i=1}^{n}x_{i}-\frac{n\mu^{2}}{2\sigma^{2}}\Big] && (3)\\
 &= h(\boldsymbol{x})\exp\Big[\frac{n\mu}{\sigma^{2}}T(\boldsymbol{x})-\frac{n\mu^{2}}{2\sigma^{2}}\Big] && (4)\\
 &= h(\boldsymbol{x})g(T(\boldsymbol{x}),\mu) && (5)
 \end{aligned}$

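where $T(\boldsymbol{x})=\overline{x}=\frac{1}{n}\sum_{i=1}^{n}x_{i}$, $g$ is the last exponential expression in $(4)$, and $h$ is the rest of the expression in $(3)$, which does not involve $\mu$. By the factorization criterion, $T(\boldsymbol{X})=\overline{X}$ is a sufficient statistic for $\mu$.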
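The factorization in $(3)$ to $(5)$ can be checked numerically. The following is a minimal sketch using only the Python standard library; the sample values, the value of $\sigma^{2}$, and the function names `joint_density`, `h`, and `g` are illustrative assumptions, not part of the original entry.

```python
import math

def joint_density(xs, mu, sigma2):
    """Joint N(mu, sigma2) density of an independent sample, as in (1)."""
    return math.prod(
        math.exp(-(x - mu) ** 2 / (2 * sigma2)) / math.sqrt(2 * math.pi * sigma2)
        for x in xs
    )

def h(xs, sigma2):
    """Factor free of mu: normalizing constant times exp[-sum(x_i^2) / (2 sigma2)]."""
    n = len(xs)
    return (2 * math.pi * sigma2) ** (-n / 2) * math.exp(
        -sum(x * x for x in xs) / (2 * sigma2)
    )

def g(t, mu, sigma2, n):
    """Factor that sees the data only through t, the sample mean, as in (4)."""
    return math.exp(n * mu * t / sigma2 - n * mu ** 2 / (2 * sigma2))

xs = [0.3, -1.2, 2.5, 0.9]   # illustrative sample
sigma2 = 1.5                 # variance treated as known
t = sum(xs) / len(xs)        # T(x) = sample mean

# Compare f(x | mu) with h(x) * g(T(x), mu) for several values of mu.
checks = [
    math.isclose(
        joint_density(xs, mu, sigma2),
        h(xs, sigma2) * g(t, mu, sigma2, len(xs)),
        rel_tol=1e-9,
    )
    for mu in (-1.0, 0.0, 0.7, 2.0)
]
print(checks)
```

Every entry of `checks` should be `True`, since the two sides differ only by floating-point rounding.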

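2. Similarly, the above shows that the sample variance $s^{2}$ alone is not a sufficient statistic for $\sigma^{2}$ if $\mu$ is unknown: by $(2)$ and the identity $\sum_{i=1}^{n}(x_{i}-\mu)^{2}=(n-1)s^{2}+n(\overline{x}-\mu)^{2}$, the density depends on $\boldsymbol{x}$ through $\overline{x}$ as well as $s^{2}$, and the dependence on $\overline{x}$ cannot be absorbed into a factor $h(\boldsymbol{x})$ free of the parameters.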
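One way to see this concretely: if $s^{2}$ were sufficient, then for two samples with the same $s^{2}$ the ratio of their densities would equal $h(\boldsymbol{x})/h(\boldsymbol{y})$ and so could not change with the parameters. The sketch below, with illustrative samples and a hypothetical helper `log_density`, shows the log-ratio changing with $\mu$.

```python
import math
import statistics

def log_density(xs, mu, sigma2):
    """Log of the joint N(mu, sigma2) density, i.e. the log of expression (2)."""
    n = len(xs)
    return (-0.5 * n * math.log(2 * math.pi * sigma2)
            - sum((x - mu) ** 2 for x in xs) / (2 * sigma2))

# Two illustrative samples with equal sample variance but different means.
xs = [0.0, 1.0, 2.0]
ys = [10.0, 11.0, 12.0]
same_s2 = statistics.variance(xs) == statistics.variance(ys)

def log_ratio(mu, sigma2):
    """log f(x | mu, sigma2) - log f(y | mu, sigma2)."""
    return log_density(xs, mu, sigma2) - log_density(ys, mu, sigma2)

ratio_a = log_ratio(0.0, 1.0)   # close to 180
ratio_b = log_ratio(5.0, 1.0)   # close to 30
print(same_s2, ratio_a, ratio_b)
```

The two log-ratios differ even though both samples have $s^{2}=1$, so no factorization of the form $g(s^{2},\theta)h(\boldsymbol{x})$ can hold.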

3. But, if $\mu$ is a known constant, then the statistic

 $T(X_{1},\ldots,X_{n})=\frac{1}{n-1}\sum_{i=1}^{n}(X_{i}-\mu)^{2}$

is sufficient for $\sigma^{2}$. To see this, note that the exponent in $(2)$ equals $-(n-1)T(\boldsymbol{x})/(2\sigma^{2})$, so one may take $h(\boldsymbol{x})=1$ and let $g(T,\sigma^{2})$ be all of expression $(2)$.
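As a rough numerical illustration of this case, the sketch below fixes an assumed known mean `MU`, writes expression $(2)$ on the log scale as a function of $T$ and $\sigma^{2}$ only, and compares it with the directly computed log-density. The samples and function names are hypothetical choices made for the example.

```python
import math

MU = 1.0  # known mean (illustrative value)

def log_density(xs, sigma2):
    """Log of the joint N(MU, sigma2) density, computed directly from the data."""
    n = len(xs)
    return (-0.5 * n * math.log(2 * math.pi * sigma2)
            - sum((x - MU) ** 2 for x in xs) / (2 * sigma2))

def T(xs):
    """The statistic from the text: sum of squared deviations from MU over n - 1."""
    return sum((x - MU) ** 2 for x in xs) / (len(xs) - 1)

def log_g(t, sigma2, n):
    """Expression (2) on the log scale, as a function of t and sigma2 only."""
    return -0.5 * n * math.log(2 * math.pi * sigma2) - (n - 1) * t / (2 * sigma2)

# Two different samples that share the same value of T.
xs = [4.0, 5.0]
ys = [6.0, 1.0]
same_T = T(xs) == T(ys)

# With h(x) = 1, the density should agree with g(T(x), sigma2) for every sigma2.
agree = all(
    math.isclose(log_density(s, s2), log_g(T(s), s2, len(s)), rel_tol=1e-9)
    for s in (xs, ys)
    for s2 in (0.5, 1.0, 4.0)
)
print(same_T, agree)
```

Because the two samples share the same $T$, their densities coincide for every $\sigma^{2}$, which is what sufficiency with $h(\boldsymbol{x})=1$ predicts.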

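Title: factorization criterion
Canonical name: FactorizationCriterion
Date of creation: 2013-03-22 15:02:48
Last modified on: 2013-03-22 15:02:48
Owner: CWoo (3771)
Last modified by: CWoo (3771)
Numerical id: 4
Author: CWoo (3771)
Entry type: Theorem
Classification: msc 62B05
Synonym: factorization theorem
Synonym: Fisher-Neyman factorization theorem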