Sample Probability Problem

Problem

Suppose we are given a set $S$ consisting of $N$ different elements, and a map

$$f : S \to \{0, 1\}$$

on that set, which maps every item to either $0$ or $1$.

The fraction of items that are mapped to $1$ is given by

$$p = \frac{1}{N} \sum_{x \in S} f(x).$$

Given a subset $A \subseteq S$ of size $n$, define the sample proportion

$$\hat{p}(A) = \frac{1}{n} \sum_{x \in A} f(x).$$

Assume we select a subset $A$ uniformly at random from all subsets of $S$ of size $n$. Find the expectation value and the variance of the random variable $\hat{p}(A)$.
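To make the setup concrete, here is a minimal Python sketch of the objects in the problem statement. The population (the integers 0 to 9), the particular indicator map (multiples of 3), the sample size and the function names are all illustrative assumptions, not part of the problem.

```python
from fractions import Fraction
import random

# Hypothetical population: the integers 0..9 (an assumption for illustration).
S = list(range(10))


def f(x):
    """Illustrative indicator map into {0, 1}: 1 for multiples of 3, else 0."""
    return 1 if x % 3 == 0 else 0


def population_fraction(S, f):
    """Fraction of the elements of S that f maps to 1."""
    return Fraction(sum(f(x) for x in S), len(S))


def sample_proportion(A, f):
    """Sample proportion: the mean of f over the subset A."""
    return Fraction(sum(f(x) for x in A), len(A))


p = population_fraction(S, f)

# Draw one subset of size 4 uniformly at random (sampling without replacement).
rng = random.Random(0)
A = rng.sample(S, 4)
p_hat = sample_proportion(A, f)
```

Exact fractions are used instead of floats so that later comparisons with closed-form expressions can be made without rounding error.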

Solution

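Let us first compute the expectation and variance of a single sample value $f(X)$, where $X$ is drawn uniformly at random from $S$. Since $f$ takes values in $\{0, 1\}$, we have $f(X)^2 = f(X)$, and therefore

$$\mathrm{E}[f(X)] = \frac{1}{N} \sum_{x \in S} f(x) = p, \qquad \mathrm{Var}[f(X)] = \mathrm{E}[f(X)^2] - \mathrm{E}[f(X)]^2 = p - p^2 = p(1 - p).$$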

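Let us define the set of all subsets of $S$ of size $n$,

$$\mathcal{A}_n = \{ A \in \mathcal{P}(S) : |A| = n \},$$

where $\mathcal{P}(S)$ denotes the powerset of $S$, and the size of the set $\mathcal{A}_n$ is

$$|\mathcal{A}_n| = \binom{N}{n}.$$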

For example, if $S = \{a, b, c\}$ and $n = 2$, then

$$\mathcal{A}_2 = \{ \{a, b\}, \{a, c\}, \{b, c\} \}.$$
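The family of all fixed-size subsets can be enumerated directly for small sets. The sketch below uses `itertools.combinations` on a hypothetical three-element set; the helper name `subsets_of_size` is an assumption made for illustration.

```python
from itertools import combinations
from math import comb


def subsets_of_size(S, n):
    """Return every subset of S with exactly n elements, as frozensets."""
    return [frozenset(c) for c in combinations(S, n)]


# Hypothetical three-element set, used only for illustration.
S = ["a", "b", "c"]
A_2 = subsets_of_size(S, 2)

# The number of subsets equals the binomial coefficient "len(S) choose 2".
assert len(A_2) == comb(len(S), 2)
```

Because the number of subsets grows as a binomial coefficient, this kind of enumeration is only practical for small populations.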

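Now the expectation value and variance that we are required to compute are given in terms of:

$$\mathrm{E}[\hat{p}] = \frac{1}{\binom{N}{n}} \sum_{A \in \mathcal{A}_n} \hat{p}(A) = \frac{1}{n \binom{N}{n}} \sum_{A \in \mathcal{A}_n} \sum_{x \in A} f(x),$$

$$\mathrm{E}[\hat{p}^2] = \frac{1}{\binom{N}{n}} \sum_{A \in \mathcal{A}_n} \hat{p}(A)^2 = \frac{1}{n^2 \binom{N}{n}} \sum_{A \in \mathcal{A}_n} \Big( \sum_{x \in A} f(x)^2 + \sum_{\substack{x, y \in A \\ x \neq y}} f(x) f(y) \Big).$$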

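To compute the sums that appear in the expressions above, we use a symmetry argument: every element of $S$ belongs to the same number of subsets of size $n$, so the left- and right-hand sides below agree up to an integer constant,

$$\sum_{A \in \mathcal{A}_n} \sum_{x \in A} f(x) = c_1 \sum_{x \in S} f(x),$$

where $c_1$ is a constant.

Counting the total number of terms on the left and the right, it follows that

$$n \binom{N}{n} = c_1 N \quad \Longrightarrow \quad c_1 = \frac{n}{N} \binom{N}{n} = \binom{N-1}{n-1}.$$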
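The counting argument can be checked numerically for a small case. The sketch below counts how many subsets of size 3 contain each element of a 6-element set; these sizes and the helper name `membership_counts` are illustrative assumptions.

```python
from collections import Counter
from itertools import combinations
from math import comb


def membership_counts(N, n):
    """For each element of {0, ..., N-1}, count the size-n subsets containing it."""
    counts = Counter()
    for A in combinations(range(N), n):
        counts.update(A)  # each member of A gains one occurrence
    return counts


N, n = 6, 3
counts = membership_counts(N, n)

# By symmetry every element occurs equally often, namely C(N-1, n-1) times.
assert set(counts.values()) == {comb(N - 1, n - 1)}
# Total number of terms: n per subset on the left, c_1 per element on the right.
assert sum(counts.values()) == n * comb(N, n)
```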

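Similarly, every ordered pair of distinct elements belongs to the same number of subsets, so

$$\sum_{A \in \mathcal{A}_n} \sum_{\substack{x, y \in A \\ x \neq y}} f(x) f(y) = c_2 \sum_{\substack{x, y \in S \\ x \neq y}} f(x) f(y),$$

and counting the terms on the left and the right gives

$$n(n-1) \binom{N}{n} = c_2 N(N-1) \quad \Longrightarrow \quad c_2 = \frac{n(n-1)}{N(N-1)} \binom{N}{n} = \binom{N-2}{n-2}.$$

Since $f(x)$ is either $0$ or $1$, we have $f(x)^2 = f(x)$, and it can be shown that

$$\sum_{\substack{x, y \in S \\ x \neq y}} f(x) f(y) = \Big( \sum_{x \in S} f(x) \Big)^2 - \sum_{x \in S} f(x)^2 = N^2 p^2 - N p = N p (N p - 1).$$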
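Both the pair-counting constant and the 0/1 identity can be verified by brute force on a small example. In this sketch the list `f_values` (the values of the indicator map on a hypothetical 7-element population), the subset size and the helper name `pair_sums` are assumptions chosen for illustration.

```python
from itertools import combinations, permutations
from math import comb


def pair_sums(f_values, n):
    """Return (sum over size-n subsets of ordered distinct pairs, same sum over the population)."""
    idx = range(len(f_values))
    over_subsets = sum(
        f_values[x] * f_values[y]
        for A in combinations(idx, n)
        for x, y in permutations(A, 2)  # ordered pairs with x != y
    )
    over_population = sum(
        f_values[x] * f_values[y] for x, y in permutations(idx, 2)
    )
    return over_subsets, over_population


# Hypothetical 0/1 values on a 7-element population.
f_values = [1, 0, 1, 1, 0, 0, 1]
n = 3
N = len(f_values)
k = sum(f_values)  # number of elements mapped to 1, i.e. N * p

over_subsets, over_population = pair_sums(f_values, n)

# Each ordered pair lies in C(N-2, n-2) subsets of size n.
assert over_subsets == comb(N - 2, n - 2) * over_population
# For 0/1 values the pair sum equals Np(Np - 1).
assert over_population == k * (k - 1)
```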

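Finally, we obtain simplified expressions for the expectations:

$$\mathrm{E}[\hat{p}] = \frac{1}{n \binom{N}{n}} \binom{N-1}{n-1} N p = p,$$

$$\mathrm{E}[\hat{p}^2] = \frac{1}{n^2 \binom{N}{n}} \left[ \binom{N-1}{n-1} N p + \binom{N-2}{n-2} N p (N p - 1) \right] = \frac{p}{n} + \frac{(n-1)\, p\, (N p - 1)}{n (N-1)}.$$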

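Thus, using the definition of variance $\mathrm{Var}[\hat{p}] = \mathrm{E}[\hat{p}^2] - \mathrm{E}[\hat{p}]^2$, we obtain the following final result:

$$\mathrm{Var}[\hat{p}] = \frac{p(1-p)}{n} \cdot \frac{N-n}{N-1}.$$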
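For readers who want the intermediate algebra, the following worked step is a sketch that assumes $\mathrm{E}[\hat{p}] = p$ and $\mathrm{E}[\hat{p}^2] = \frac{p}{n} + \frac{(n-1)\,p\,(Np-1)}{n(N-1)}$, with $N$ the population size, $n$ the sample size and $p$ the population fraction:

$$\begin{aligned}
\mathrm{Var}[\hat{p}] &= \frac{p}{n} + \frac{(n-1)\, p\, (N p - 1)}{n (N-1)} - p^2 \\
&= \frac{p \left[ (N-1) + (n-1)(N p - 1) - n p (N-1) \right]}{n (N-1)} \\
&= \frac{p \left[ (N-n) - p (N-n) \right]}{n (N-1)} \\
&= \frac{p(1-p)}{n} \cdot \frac{N-n}{N-1}.
\end{aligned}$$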

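The term $\frac{N-n}{N-1}$ is known as the Finite Population Correction (FPC). It can be seen that this agrees with the variance for sampling with replacement in the limit $N \to \infty$ with $n$ fixed, where $\frac{N-n}{N-1} \to 1$:

$$\lim_{N \to \infty} \mathrm{Var}[\hat{p}] = \frac{p(1-p)}{n}.$$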
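As a sanity check, the closed-form variance with the finite population correction can be compared against an exhaustive enumeration for a small population. The sketch below uses exact fractions; the example population (8 items, 3 of them mapped to 1), the sample size and the function names are illustrative assumptions.

```python
from fractions import Fraction
from itertools import combinations


def exact_moments(f_values, n):
    """Exact mean and variance of the sample proportion over all size-n subsets."""
    # combinations() selects by position, so repeated 0/1 values are distinct items.
    props = [Fraction(sum(A), n) for A in combinations(f_values, n)]
    mean = sum(props) / len(props)
    var = sum((q - mean) ** 2 for q in props) / len(props)
    return mean, var


def fpc_variance(p, N, n):
    """Closed-form variance p(1-p)/n multiplied by the FPC (N-n)/(N-1)."""
    return p * (1 - p) / n * Fraction(N - n, N - 1)


# Hypothetical population: 8 items, 3 of which are mapped to 1.
f_values = [1, 1, 1, 0, 0, 0, 0, 0]
N, n = len(f_values), 3
p = Fraction(sum(f_values), N)

mean, var = exact_moments(f_values, n)
assert mean == p
assert var == fpc_variance(p, N, n)

# For a much larger population with the same n, the FPC is close to 1.
fpc_large = Fraction(10**6 - n, 10**6 - 1)
assert 1 - fpc_large < Fraction(1, 10**5)
```

The enumeration visits every subset, so it is feasible only for small populations; for larger ones a Monte Carlo estimate would be the natural substitute.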