Sample Mean, Sample Variance, and the Law of Large Numbers

Sample Mean (\(\bar{X}\))

The sample mean measures the central tendency of a dataset. It is the sum of the observed values \(X_1, \dots, X_n\) divided by the number of observations \(n\):

\[ \bar{X} = \frac{1}{n} \sum_{i=1}^n X_i \]
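The formula translates directly into code. The following is a minimal sketch using only the Python standard library; the function name `sample_mean` and the data in `observations` are illustrative assumptions, not part of the original material.

```python
from statistics import mean


def sample_mean(values):
    """Return (1/n) * sum(X_i) for a non-empty sequence of numbers."""
    n = len(values)
    if n == 0:
        raise ValueError("sample_mean requires at least one observation")
    return sum(values) / n


# Hypothetical observations, chosen only for illustration.
observations = [2, 4, 4, 5, 10]
x_bar = sample_mean(observations)

# Cross-check against the standard library implementation.
print(x_bar, mean(observations))
```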

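Sample Variance (\(S^2\))

The sample variance measures the spread of the data, i.e., how far the observed values deviate from the sample mean. The "corrected" form divides by \(n-1\) rather than \(n\), which makes it an unbiased estimator of the population variance: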

\[ S^2 = \frac{1}{n-1} \sum_{i=1}^n (X_i - \bar{X})^2 \]
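As a rough sketch of how the two variants differ in practice, the code below computes the variance with either divisor. The function name `sample_variance`, its `corrected` flag, and the data are assumptions made for illustration; the standard library functions `statistics.variance` (divisor \(n-1\)) and `statistics.pvariance` (divisor \(n\)) serve as a cross-check.

```python
from statistics import variance, pvariance


def sample_variance(values, corrected=True):
    """Sum of squared deviations from the mean, divided by n-1 (corrected) or n."""
    n = len(values)
    if n < 2 and corrected:
        raise ValueError("the corrected variance needs at least two observations")
    if n == 0:
        raise ValueError("the variance needs at least one observation")
    x_bar = sum(values) / n
    squared_deviations = sum((x - x_bar) ** 2 for x in values)
    return squared_deviations / (n - 1 if corrected else n)


# Hypothetical observations, chosen only for illustration.
observations = [2, 4, 4, 5, 10]
s2_corrected = sample_variance(observations)
s2_uncorrected = sample_variance(observations, corrected=False)

print(s2_corrected, variance(observations))     # divisor n-1
print(s2_uncorrected, pvariance(observations))  # divisor n
```

The uncorrected value is always the corrected one multiplied by \((n-1)/n\), so the two differ noticeably only for small samples.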

The Law of Large Numbers (LLN)

The Law of Large Numbers states that, for independent and identically distributed observations with finite mean \(\mu\), the sample mean \(\bar{X}_n\) of the first \(n\) observations converges to \(\mu\) as the sample size \(n\) increases.

Mathematical Statement

\[ \lim_{n \to \infty} P(|\bar{X}_n - \mu| < \epsilon) = 1, \quad \text{for any } \epsilon > 0. \]

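This is the weak form of the law: for any tolerance \(\epsilon\), the probability that the sample mean lies within \(\epsilon\) of \(\mu\) tends to 1. With enough observations, the sample mean is therefore a reliable estimate of the population mean.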
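
One way to see why the limit holds is Chebyshev's inequality. This sketch relies on an additional assumption not stated above: the observations are independent with a finite variance \(\sigma^2\). Under that assumption:

\[ \operatorname{Var}(\bar{X}_n) = \frac{\sigma^2}{n}, \qquad P(|\bar{X}_n - \mu| \ge \epsilon) \le \frac{\sigma^2}{n \epsilon^2} \xrightarrow[n \to \infty]{} 0. \]

The bound also indicates a rate: halving \(\epsilon\) requires roughly four times as many observations to keep the same guarantee.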

Illustrative Example

Suppose you roll a fair die. Each number (\(1, 2, 3, 4, 5, 6\)) has a probability of \(1/6\), and the theoretical mean is:

\[ \mu = \frac{1+2+3+4+5+6}{6} = 3.5 \]

After only a few rolls, the average may be far from 3.5 (e.g., the rolls \(1, 6, 2\) average 3), but as the number of rolls increases, the average (\(\bar{X}\)) will converge to 3.5.
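
The die example can be checked with a short simulation. This is a sketch under stated assumptions: the function name `running_means`, the fixed seed, and the number of rolls are arbitrary choices for illustration, and the pseudo-random generator from Python's `random` module stands in for a physical die.

```python
import random


def running_means(n_rolls, seed=0):
    """Simulate fair die rolls and return the running average after each roll."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    total = 0
    means = []
    for i in range(1, n_rolls + 1):
        total += rng.randint(1, 6)  # uniform on {1, ..., 6}
        means.append(total / i)
    return means


means = running_means(100_000)
for n in (10, 100, 1_000, 10_000, 100_000):
    print(f"n = {n:>6}: running mean = {means[n - 1]:.4f}")
```

The exact figures depend on the seed, but the printed averages should settle closer to 3.5 as \(n\) grows.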

Applications in Cybersecurity

Exercise

Following the same scheme as HMWK 7, compute the distribution of the sample variance ("corrected" or not). Determine the distribution of the variances of the samples, together with its mean and variance, and discuss the observed relationship with the mean and variance of the parent (theoretical) distribution.

The results include statistical summaries and visual representations to deepen understanding of these important statistical concepts.
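
A possible starting point for the exercise is sketched below. The scheme of HMWK 7 is not reproduced here, so everything in this block is an assumption: the parent distribution (a fair die, with theoretical variance \(35/12\)), the sample size, the number of samples, and the function name `simulate_sample_variances`. It draws many samples, computes both variance estimators for each, and summarizes the resulting distributions.

```python
import random
from statistics import fmean, variance


def simulate_sample_variances(n_samples, sample_size, seed=0):
    """Draw n_samples samples of fair die rolls; return both variance estimates per sample."""
    rng = random.Random(seed)
    corrected, uncorrected = [], []
    for _ in range(n_samples):
        sample = [rng.randint(1, 6) for _ in range(sample_size)]
        x_bar = sum(sample) / sample_size
        ss = sum((x - x_bar) ** 2 for x in sample)
        corrected.append(ss / (sample_size - 1))  # divisor n-1
        uncorrected.append(ss / sample_size)      # divisor n
    return corrected, uncorrected


SIGMA2 = 35 / 12   # theoretical variance of a fair die (assumed parent distribution)
SAMPLE_SIZE = 10   # arbitrary choice for illustration

corrected, uncorrected = simulate_sample_variances(20_000, SAMPLE_SIZE)

print(f"parent variance:                 {SIGMA2:.4f}")
print(f"mean of corrected variances:     {fmean(corrected):.4f}")
print(f"mean of uncorrected variances:   {fmean(uncorrected):.4f}")
print(f"variance of corrected variances: {variance(corrected):.4f}")
```

In expectation, the corrected estimator matches the parent variance \(\sigma^2\), while the uncorrected one equals \(\frac{n-1}{n}\sigma^2\), so the simulated means should land near those two values. A histogram of `corrected` and `uncorrected` would give the visual representation the exercise asks for.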

Access the Exercise

You can access the full interactive exercise by clicking the link below:

Go to the Exercise