A statistic, or sample statistic, $S$ is simply a function, usually real-valued, of a set of (sample) data or observations $\boldsymbol{X}$: $S = S(\boldsymbol{X})$. More formally, if $\Omega$ is the sample space of the data $\boldsymbol{X}$, then $S$ is a function from $\Omega$ to some set $T$, usually a subset of $\mathbb{R}^k$. The data $\boldsymbol{X}$ is usually considered as a vector of iid random variables $\boldsymbol{X} = (X_1, \ldots, X_n)$.
100 light bulbs out of 1,000,000 are tested for their functionality. Then the number $N$ of defective light bulbs among the 100 sampled is a statistic. To see this, define, for each $i$ from 1 to 100,
\[ X_i = \begin{cases} 1 & \text{if the $i$th light bulb is defective,} \\ 0 & \text{otherwise.} \end{cases} \]
Then $N = \sum_{i=1}^{100} X_i$, a function of the data. Similarly, the number of operating light bulbs is also a statistic if we switch the 1 and 0 in the above definition of the $X_i$'s. If we make all $X_i = 1$, then $N$ is just the count of the observations, one of the simplest forms of sample statistics. If we make all $X_i = 0$, then $N = 0$ is a statistic that is not at all useful.
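The counting statistic above can be sketched in a few lines of Python. This is only an illustration: the 2% defect rate, the random seed, and the function names `count_defective` and `count_working` are assumptions made here, not part of the example itself.

```python
import random

# Hypothetical test results: 1 marks a defective bulb, 0 a working one.
# The 2% defect rate and the seed are assumptions for illustration only.
rng = random.Random(0)
indicators = [1 if rng.random() < 0.02 else 0 for _ in range(100)]

def count_defective(x):
    """The statistic N: the sum of the indicator variables."""
    return sum(x)

def count_working(x):
    """The same kind of statistic with the roles of 0 and 1 switched."""
    return sum(1 - xi for xi in x)

n_defective = count_defective(indicators)
n_working = count_working(indicators)
print(n_defective, n_working)
```

Both functions depend on the data alone, which is what makes them statistics; the two counts always add up to the sample size.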
Let $X_1, \ldots, X_{20}$ be the weights of 20 students from a particular college. Then the average weight defined by
\[ \overline{X} = \frac{1}{20} \sum_{i=1}^{20} X_i \]
is a statistic. It is commonly called the sample mean. It is often used to estimate $\mu = E[X]$, the expectation of a particular random variable $X$, which, in this case, is the weight of a student in the college. Of course, other averages, such as the median, the mode, and the trimmed mean, are also examples of (sample) statistics.
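As a rough sketch of these averages in Python, the block below computes the sample mean, the median, and a trimmed mean. The 20 weights are made-up values in kilograms, and `sample_mean` and `trimmed_mean` are hypothetical helper names chosen for this illustration.

```python
import statistics

# Hypothetical weights (kg) of 20 students; the values are invented.
weights = [61.5, 72.0, 68.3, 80.1, 55.4, 90.2, 66.7, 74.9, 59.8, 70.0,
           63.2, 77.5, 84.6, 58.1, 69.4, 71.8, 65.0, 88.3, 60.7, 73.6]

def sample_mean(x):
    """Sample mean: the sum of the observations divided by their count."""
    return sum(x) / len(x)

def trimmed_mean(x, k):
    """Mean after dropping the k smallest and k largest observations."""
    if 2 * k >= len(x):
        raise ValueError("k is too large for this sample")
    kept = sorted(x)[k:len(x) - k]
    return sum(kept) / len(kept)

mean_w = sample_mean(weights)
median_w = statistics.median(weights)
trimmed_w = trimmed_mean(weights, 2)
print(mean_w, median_w, trimmed_w)
```

Each of the three values is computed from the observations only, so each is a statistic in the sense defined above.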
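Using the weights $X_1, \ldots, X_{20}$ of the 20 students from the sample mean example, and writing $\overline{X}$ for their sample mean, we can define
\[ s^2 = \frac{1}{19} \sum_{i=1}^{20} (X_i - \overline{X})^2. \]
This is also a statistic, for, after some substitution and rewriting,
\[ s^2 = \frac{1}{19} \left( \sum_{i=1}^{20} X_i^2 - \frac{1}{20} \Big( \sum_{i=1}^{20} X_i \Big)^2 \right), \]
which is a function explicitly in terms of the $X_i$'s. This statistic is known as the sample variance, which is a common estimate of $\sigma^2$, the variance of the random variable $X$. Again, in this example, $X$ is the weight of a student in the college.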
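A quick numerical check can make the rewriting step concrete. The sketch below assumes the usual $n-1$ divisor for the sample variance and uses invented weights; it compares the definition in terms of deviations from the mean with the expanded form in terms of the raw observations.

```python
import statistics

# Hypothetical weights (kg) of 20 students; the values are invented.
weights = [61.5, 72.0, 68.3, 80.1, 55.4, 90.2, 66.7, 74.9, 59.8, 70.0,
           63.2, 77.5, 84.6, 58.1, 69.4, 71.8, 65.0, 88.3, 60.7, 73.6]

n = len(weights)
xbar = sum(weights) / n

# Definition: squared deviations from the sample mean, divided by n - 1
# (the n - 1 divisor is an assumption of this sketch).
s2_definition = sum((x - xbar) ** 2 for x in weights) / (n - 1)

# Expanded form: written directly in terms of the observations.
s2_expanded = (sum(x * x for x in weights) - sum(weights) ** 2 / n) / (n - 1)

print(s2_definition, s2_expanded)
```

The two values agree up to floating-point rounding. In practice the deviation form is usually preferred numerically, since the expanded form subtracts two large, nearly equal quantities.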
Any function of a statistic is again a statistic.
Since the underlying data is assumed to be random, a statistic is necessarily a random variable.
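One way to see this is by simulation: drawing many samples and computing the same statistic on each gives a different value every time. The sketch below assumes, purely for illustration, a normal population with mean 70 and standard deviation 10, and samples of size 20; `draw_sample` is a hypothetical helper.

```python
import random
import statistics

rng = random.Random(42)

def draw_sample(n=20):
    """Draw n iid observations from an assumed normal(70, 10) population."""
    return [rng.gauss(70, 10) for _ in range(n)]

# Compute the sample mean on 1000 independent samples.
means = [statistics.mean(draw_sample()) for _ in range(1000)]

# The statistic varies from sample to sample; its spread is the
# standard deviation of its sampling distribution.
spread = statistics.stdev(means)
print(min(means), max(means), spread)
```

Under these assumptions the spread of the sample means should come out near $10/\sqrt{20} \approx 2.24$, the theoretical standard deviation of the mean of 20 such observations.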
Although statistics are mostly real-valued, as in the examples above, a statistic can also be vector-valued or even function-valued.
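To make this concrete, the sketch below uses two standard examples chosen here for illustration: the order statistics (the sorted sample, a vector-valued statistic) and the empirical distribution function (a function-valued statistic). The function names and the data are hypothetical.

```python
def order_statistics(x):
    """Vector-valued statistic: the observations sorted in increasing order."""
    return sorted(x)

def empirical_cdf(x):
    """Function-valued statistic: returns the function
    t -> (proportion of observations less than or equal to t)."""
    data = sorted(x)
    n = len(data)

    def F(t):
        return sum(1 for v in data if v <= t) / n

    return F

data = [3.1, 1.4, 2.7, 1.4, 5.0]  # invented observations
ordered = order_statistics(data)
F = empirical_cdf(data)
print(ordered, F(1.4), F(3.0))
```

Here `order_statistics` maps the data to a vector and `empirical_cdf` maps the data to a function, yet both are determined entirely by the observations.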
|Date of creation||2013-03-22 14:46:18|
|Last modified on||2013-03-22 14:46:18|
|Last modified by||CWoo (3771)|