Applying the Hájek Approach in Formula-Based Variance Estimation

Qian, Jiahe
Estimated Variance, Two-Stage Testing, Simulation, Cluster Sampling, Horvitz-Thompson Estimator, National Assessment of Educational Progress (NAEP), Probability, Monte Carlo Simulations, Sample Size


The variance formula derived for a two-stage sampling design without replacement uses the joint inclusion probabilities from the first-stage selection of clusters. A common difficulty in data analysis is that these joint inclusion probabilities are not available. One way to address this is to apply Hájek’s approximation of the joint probabilities in variance estimation. To assess the Hájek approach, several estimators of Hájek’s c and d are proposed. The application is illustrated with simulated and real data. A Monte Carlo simulation is used to compare the joint inclusion probabilities obtained under probability-proportional-to-size sampling methods with those given by Hájek’s approximation. Empirical variance estimates from the jackknife procedure are also compared with the formula-based variances that incorporate Hájek’s approximation.
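As a rough illustration of the idea, the sketch below computes approximate joint inclusion probabilities from first-order inclusion probabilities alone, using the commonly cited form of Hájek's approximation, pi_ij ≈ pi_i * pi_j * [1 - (1 - pi_i)(1 - pi_j) / d] with d = sum over k of pi_k * (1 - pi_k). The function name and the example probabilities are hypothetical, and the specific estimators of c and d proposed in the report are not reproduced here.

```python
def hajek_joint_probs(pi):
    """Approximate joint inclusion probabilities from first-order ones.

    Assumes the textbook form of Hajek's approximation:
        pi_ij ~= pi_i * pi_j * (1 - (1 - pi_i) * (1 - pi_j) / d),
        d = sum_k pi_k * (1 - pi_k).
    This is an illustrative sketch, not the report's own estimator.
    """
    d = sum(p * (1.0 - p) for p in pi)
    if d <= 0.0:
        raise ValueError("d must be positive; need some 0 < pi_k < 1")
    n = len(pi)
    joint = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                # By convention, pi_ii is the first-order probability.
                joint[i][j] = pi[i]
            else:
                joint[i][j] = pi[i] * pi[j] * (
                    1.0 - (1.0 - pi[i]) * (1.0 - pi[j]) / d
                )
    return joint


# Hypothetical first-stage inclusion probabilities for five clusters
# (they sum to 2.0, i.e. a sample of two clusters).
pi = [0.2, 0.3, 0.5, 0.4, 0.6]
joint = hajek_joint_probs(pi)
```

In this form every approximated pi_ij with i != j falls below the product pi_i * pi_j, which is the condition that keeps each term of the Sen-Yates-Grundy variance estimator non-negative.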

