Cutoff vs. Design-Based Sampling and Inference For Establishment Surveys

by James R. Knaub, Jr.

Abstract: Most sample surveys, especially household surveys, are design-based, meaning that sampling and inference (e.g., means and standard errors) are determined by a process of randomization. That is, each member of the population has a predetermined probability of being selected for a sample. Establishment surveys generally have many smaller entities and relatively few large ones. Still, each can be assigned a given probability of selection. However, an alternative may be to collect data from generally the largest establishments (some being larger for some attributes than others) and use regression to estimate for the remainder. For design-based sampling, or even for a census survey, such models are often needed to impute for nonresponse. When such modeling would be needed for many small respondents, generally if sample data are collected on a frequent basis, but regressor (related) data are available for all of the population, then cutoff sampling with regression used for inference may be a better alternative. Note that with regression, one can always calculate an estimate of variance for an estimated total. (For example, see Knaub (1996), and note Knaub (2007d).)
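
As a rough illustration of the idea in the abstract, the following Python sketch applies a classical ratio estimator to a cutoff sample. It is not taken from the article. It assumes a one-regressor model through the origin, y_i = b*x_i + e_i with Var(e_i) = sigma^2 * x_i, and uses the standard model-based prediction variance for that model. The function names, the cutoff value, and the data are hypothetical.

```python
from math import sqrt


def split_by_cutoff(x_all, cutoff):
    """Split regressor values into sampled (x >= cutoff) and unsampled units."""
    x_sampled = [x for x in x_all if x >= cutoff]
    x_unsampled = [x for x in x_all if x < cutoff]
    return x_sampled, x_unsampled


def ratio_estimate_total(x_sampled, y_sampled, x_unsampled):
    """Estimate a population total and its standard error.

    Assumed model: y_i = b * x_i + e_i, with Var(e_i) = sigma^2 * x_i.
    Requires at least two sampled units and strictly positive x values.
    """
    n = len(x_sampled)
    sum_x_s = sum(x_sampled)
    sum_y_s = sum(y_sampled)
    sum_x_r = sum(x_unsampled)

    # Weighted least squares slope under Var(e_i) proportional to x_i.
    b = sum_y_s / sum_x_s

    # Observed part plus the predicted part for units below the cutoff.
    total = sum_y_s + b * sum_x_r

    # Estimate of sigma^2 from the weighted residuals.
    sigma2 = sum((y - b * x) ** 2 / x
                 for x, y in zip(x_sampled, y_sampled)) / (n - 1)

    # Model-based variance of the prediction error for the total.
    variance = sigma2 * sum_x_r * (sum_x_s + sum_x_r) / sum_x_s
    return total, sqrt(variance)


# Hypothetical data: regressor x is known for all six establishments,
# but y is collected only from those with x at or above the cutoff.
x_all = [100, 80, 60, 5, 3, 2]
x_s, x_r = split_by_cutoff(x_all, cutoff=50)
y_s = [210, 150, 120]  # observed values for the three largest units

total, std_error = ratio_estimate_total(x_s, y_s, x_r)
print(total, std_error)
```

The variance here reflects only the assumed model. If the smallest establishments behave differently from the largest ones, the estimate can be biased in a way this calculation does not capture.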

Key Words: classical ratio estimator, conditionality principle, model failure, probability proportionate to size (PPS), randomization principle, regression, skewed data, superpopulation, total survey error

Author:
James R. Knaub, Jr., James.Knaub@eia.doe.gov

Editor: Suojin Wang, sjwang@PICARD.tamu.edu

READING THE ARTICLE: You can read the article in portable document (.pdf) format (509,335 bytes).

NOTE: The content of this article is the intellectual property of the author, who retains all rights to future publication.

This page has been accessed 3084 times since June 28, 2008.


Return to the InterStat Home Page.