PlanetMath
 Math for the people, by the people.
statistic (Definition)

A statistic, or sample statistic, $S$ is simply a function, usually real-valued, of a set of (sample) data or observations $\boldsymbol{X}=(X_1,X_2,\ldots,X_n)$: $S=S(\boldsymbol{X})$. More formally, if $\Omega$ is the sample space of the data $\boldsymbol{X}$, then $S$ is a function from $\Omega$ to some set $T$, usually a subset of $\mathbb{R}^k$. The data $\boldsymbol{X}$ are usually regarded as a vector of iid random variables $X_i$.

Examples

  1. Suppose 100 light bulbs out of 1,000,000 are tested for their functionality. Then the number $n$ of defective light bulbs in the sample of 100 is a statistic. To see this, define, for each $i$ from 1 to 100,

    \begin{displaymath} x_i = \begin{cases} 1 & \text{if the event }X_i=\lbrace\text{the $i$th light bulb is defective}\rbrace\text{ occurs,} \\ 0 & \text{otherwise.} \end{cases}\end{displaymath}
    Then $n=\sum_{i=1}^{100} x_i$, a function of the data. Similarly, the number of operating light bulbs is also a statistic: switch the 1 and 0 in the above definition of the $x_i$'s. If we set every $x_i=1$, then $n$ is just the count of the observations, one of the simplest sample statistics. If we set every $x_i=0$, then $n=0$ is a constant statistic that carries no information about the data and is not at all useful.
  2. Let $w_1,w_2,\ldots,w_{20}$ be the weights of 20 students from a particular college. Then the average weight defined by $$\overline{w}=\frac{1}{20}\sum_{i=1}^{20}w_i$$ is a statistic. It is commonly called the sample mean. It is often used to estimate $\operatorname{E}[X]$, the expectation of a random variable $X$, which in this case is the weight of a student in the college. Other averages, such as the median, the mode, and the trimmed mean, are also examples of (sample) statistics.
  3. Using the same example as in (2), we can define $$s^2=\frac{1}{20-1}\sum_{i=1}^{20}(w_i-\overline{w})^2.$$ This is also a statistic, for, after substituting the definition of $\overline{w}$ and expanding the square, $$s^2=\frac{1}{20-1}\Big[\sum_{i=1}^{20}w_i^2-\frac{1}{20}\Big(\sum_{i=1}^{20}w_i\Big)^2\Big],$$ which is a function explicitly in terms of the $w_i$'s. This statistic is known as the sample variance, and it is a common estimate of $\operatorname{Var}[X]$, the variance of the random variable $X$. Again, in this example, $X$ is the weight of a student in the college.
  4. Again borrowing from the example above, we can sort the weights of the 20 students in ascending order to obtain a vector of 20 real numbers $(w_{(1)},w_{(2)},\ldots,w_{(20)})$. This is also a statistic; its $i$th component $w_{(i)}$ is called the $i$th order statistic. It is not real-valued: its range is a subset of $\mathbb{R}^{20}$.
  5. Given a set of numeric observations $X_1,X_2,\ldots,X_n$, and without knowing the distribution of these observations, one can define the empirical distribution function $\hat{F}$, a real-valued function constructed from the observations: $\hat{F}(x)$ is the proportion of observations not exceeding $x$. This is an example of a statistic whose range is a function space.
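
The five examples above can be sketched in Python. This is only an illustrative sketch, not part of the original entry: the function names and the simulated weights are assumptions made up for the illustration, and the two variance functions correspond to the two forms of $s^2$ in example (3).

```python
import bisect
import math
import random


def defective_count(indicators):
    """Example 1: the count n = sum of the 0/1 indicators x_i."""
    return sum(indicators)


def sample_mean(w):
    """Example 2: arithmetic mean of the observations."""
    return sum(w) / len(w)


def sample_variance(w):
    """Example 3: sum of squared deviations from the mean, divided by n - 1."""
    m = sample_mean(w)
    return sum((x - m) ** 2 for x in w) / (len(w) - 1)


def sample_variance_expanded(w):
    """Example 3, rewritten explicitly in terms of the w_i."""
    n = len(w)
    return (sum(x * x for x in w) - sum(w) ** 2 / n) / (n - 1)


def order_statistics(w):
    """Example 4: the observations sorted in ascending order (vector-valued)."""
    return tuple(sorted(w))


def empirical_cdf(xs):
    """Example 5: return the function t -> proportion of observations <= t."""
    data = sorted(xs)

    def f_hat(t):
        return bisect.bisect_right(data, t) / len(data)

    return f_hat


# Hypothetical data: 20 simulated weights (kg); not taken from the entry.
rng = random.Random(0)
weights = [rng.gauss(70.0, 10.0) for _ in range(20)]

print("sample mean:", sample_mean(weights))
print("sample variance:", sample_variance(weights))
print("two variance forms agree:",
      math.isclose(sample_variance(weights), sample_variance_expanded(weights)))
ordered = order_statistics(weights)
print("smallest and largest weight:", ordered[0], ordered[-1])
f_hat = empirical_cdf(weights)
print("F_hat(70):", f_hat(70.0))
```

Each function takes the data as its only input, which is what makes it a statistic. Note that `empirical_cdf` returns a function rather than a number, matching the remark in example (5) that the range is a function space.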

Remarks.

  • Any function of a statistic is again a statistic.
  • Since the underlying data is assumed to be random, a statistic is necessarily a random variable.
  • Although mostly real-valued, a statistic can be vector-valued, or even function-valued as we have seen in earlier examples.
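
The remarks can be illustrated with a short simulation. This is a hedged sketch under assumed conditions: the normal population with mean 70 and standard deviation 10, the seed, and all function names are hypothetical choices for the illustration.

```python
import math
import random


def sample_mean(x):
    return sum(x) / len(x)


def sample_variance(x):
    m = sample_mean(x)
    return sum((v - m) ** 2 for v in x) / (len(x) - 1)


def sample_std(x):
    # A function of a statistic (the sample variance) is again a statistic.
    return math.sqrt(sample_variance(x))


def summary(x):
    # A vector-valued statistic with values in R^2.
    return (sample_mean(x), sample_variance(x))


def draw_sample(rng, n=20):
    # Hypothetical population: normal with mean 70 and standard deviation 10.
    return [rng.gauss(70.0, 10.0) for _ in range(n)]


# Because the data are random, the statistic is random as well:
# each fresh sample yields a different value of the sample mean.
rng = random.Random(42)
means = [sample_mean(draw_sample(rng)) for _ in range(5)]
print("sample means over 5 independent samples:", means)
print("summary of one more sample:", summary(draw_sample(rng)))
```

The printed sample means differ from one sample to the next, which is the sense in which a statistic is itself a random variable.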




"statistic" is owned by CWoo.
Also defines:  sample mean, sample variance


This is version 8 of statistic, born on 2004-10-24, modified 2007-09-20.
Object id is 6416, canonical name is Statistic.

Classification:
AMS MSC62A01 (Statistics :: Foundational and philosophical topics)
