## Cutoff vs. Design-Based Sampling and Inference For Establishment Surveys

### by James R. Knaub, Jr.

**Abstract:**
Most sample surveys, especially household surveys, are design-based, meaning that sampling and inference (e.g., means and standard errors) are determined by a process of randomization. That is, each member of the population has a predetermined probability of being selected for a sample. Establishment populations generally have many smaller entities and relatively few large ones. Still, each can be assigned a given probability of selection. An alternative, however, is to collect data from generally the largest establishments (some being larger for some attributes than others) and use regression to estimate for the remainder. For design-based sampling, or even for a census survey, such models are often needed to impute for nonresponse. When such modeling would be needed for many small respondents, which is generally the case if sample data are collected on a frequent basis, and regressor (related) data are available for all of the population, then cutoff sampling with regression used for inference may be a better alternative. Note that with regression, one can always calculate an estimate of variance for an estimated total. (For example, see Knaub (1996), and note Knaub (2007d).)

**Key Words:**
classical ratio estimator, conditionality principle, model failure, probability proportionate to size (PPS), randomization principle, regression, skewed data, superpopulation, total survey error
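
The cutoff-sampling idea described in the abstract can be illustrated with a minimal sketch. It assumes the classical ratio estimator (listed among the key words): a regression through the origin, y_i = b·x_i + e_i, with error variance proportional to x_i. The function name `ratio_estimate_total`, its arguments, and the example data are hypothetical, chosen only for illustration; the article's own formulation may differ. The variance expression is the standard model-based prediction variance for this estimator under the stated model.

```python
from math import sqrt


def ratio_estimate_total(x_sample, y_sample, x_remainder):
    """Estimate a population total from a cutoff sample.

    x_sample    : regressor values for the sampled (largest) establishments
    y_sample    : observed values for the same sampled establishments
    x_remainder : regressor values for establishments below the cutoff

    Assumes y_i = b * x_i + e_i with Var(e_i) = sigma^2 * x_i and x_i > 0.
    Returns (b, estimated_total, estimated_standard_error).
    """
    n = len(x_sample)
    if n < 2 or n != len(y_sample):
        raise ValueError("need at least two matched (x, y) sample pairs")
    if any(x <= 0 for x in x_sample):
        raise ValueError("sampled regressor values must be positive")

    sum_x_s = sum(x_sample)
    sum_y_s = sum(y_sample)
    sum_x_r = sum(x_remainder)

    # Classical ratio estimator of the slope.
    b = sum_y_s / sum_x_s

    # Observed part of the total plus the predicted part for the remainder.
    total = sum_y_s + b * sum_x_r

    # Weighted residual variance (weights 1/x_i), n - 1 degrees of freedom.
    sigma2 = sum((y - b * x) ** 2 / x
                 for x, y in zip(x_sample, y_sample)) / (n - 1)

    # Model-based variance of the prediction error for the total.
    var_total = sigma2 * sum_x_r * (sum_x_s + sum_x_r) / sum_x_s

    return b, total, sqrt(var_total)


# Hypothetical data: regressor known for all seven establishments,
# y collected only from the three with x at or above a cutoff of 500.
x_all = [900, 750, 600, 40, 30, 20, 10]
cutoff = 500
x_s = [x for x in x_all if x >= cutoff]
x_r = [x for x in x_all if x < cutoff]
y_s = [1830, 1475, 1220]

b, total, se = ratio_estimate_total(x_s, y_s, x_r)
print(f"slope={b:.4f} total={total:.1f} se={se:.2f}")
```

Because the establishments below the cutoff contribute only through the sum of their regressor values, the estimated standard error shrinks as the sampled establishments cover a larger share of the regressor total. This estimated standard error is only meaningful if the model holds for the unsampled establishments (compare "model failure" among the key words).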

**Author:**

James R. Knaub, Jr., James.Knaub@eia.doe.gov

**Editor:**
Suojin Wang, sjwang@PICARD.tamu.edu

**READING THE ARTICLE:** You can read the article in portable document (.pdf) format (509,335 bytes).

**NOTE:** The content of this article is the intellectual property of the author, who retains all rights to future publication.

