IV. Sample Design

Webpage last modified: 2008-Sep-19

Introduction

Optimal sample design can be thought of as a probability sample design [see probability sampling] that maximizes the amount of information obtained per monetary unit spent. Different nations have different sampling resources and conditions. For a cross-cultural survey, this means that the optimal sample design for one country may not be the optimal design for another. Therefore, allowing each participating country flexibility in its choice of sample design is highly recommended, so long as all sample designs use probability methods at each stage of selection [6] [16].

This chapter outlines the decisions that need to be made when designing a cross-cultural probability survey sample and encourages cross-cultural survey organizers to allow sample designs to differ among participating countries, but at the same time ensure standardization on the principles of probability sampling.

Please note that this chapter assumes that the reader has a basic understanding of statistics and terms such as variance and S² (the population element variance). Please refer to Further Reading or an introductory statistics textbook if a statistics refresher is needed.

Guidelines

Goal: To select a probability sample in each participating country that is representative of the target population and allows researchers to make inferences back to the target population, as well as to allow sample designs to vary among participating nations while maintaining probability samples.

  1. Choose the time dimension of the overall survey sample design.
    Rationale

    Sometimes the decision regarding whether the sample survey should collect data from selected elements at only one point in time or at more than one point in time is clear-cut. In many situations, however, the decision is not straightforward, and survey organizers are wise to consider the benefits and drawbacks of each method. This decision will affect all aspects of the survey, including the cost, level of effort and speed with which the results and analysis can be presented.

    Procedural steps
    • Choose to administer either a cross-sectional survey or one of the types of panel surveys.
      • Cross-Sectional Surveys: surveys where the data are collected from selected elements at one point in time.
      • Panel Surveys: surveys where the data are collected from selected elements at more than one point in time [2] [15] [18].
      • Advantages of Panel Surveys:
        • The ability to measure changes over time on the statistics of interest.
        • The ability to gather information from respondents when the information is fresh in their memories.
      • Disadvantages of Panel Surveys:
        • Difficulty in convincing respondents to participate across multiple waves of data collection (i.e., reduction in original sample members over time).
        • For surveys of mobile populations, the attrition rate in panel studies can be very high due to migrations and relocations. Survey planners are wise to consider how to clearly identify and track panel survey respondents when dealing with a mobile population.
        • Question wording and response options must stay consistent across waves in order to allow comparison over time on the statistic of interest.
    Lessons learned
    • Survey planners are sometimes naïve about the high cost and effort required to maintain a panel survey. When considering the implementation of a panel survey, it is wise to refer to the literature on longitudinal survey programs such as the Survey of Income and Program Participation [12], the British Household Panel Survey [19], the European Community Household Panel [21], and Canada's Survey of Labour and Income Dynamics [17]. This literature gives a clear sense of the effort and expense necessary to execute a panel survey, and can help survey planners make a more judicious decision regarding the time dimension of the survey design.
  2. Define the target population for the study across all countries and the survey population within each participating country/culture.
    Rationale

    The survey planners of any cross-cultural survey need to define a detailed, concise target population in order to ensure that each participating country aims to collect data from the same population. Without a precise definition, one country may collect data that include a certain subgroup, such as non-citizens, while another country may exclude this subgroup. This difference in sample composition may influence the estimates of key statistics across countries. In addition, a precise definition will let future users of the survey data know the exact population to which the survey data refer. The data users can then make a more informed decision about whether to include the survey data in their analyses.

    Procedural steps
    • Define the target population across all participating countries as clearly as possible, including the kind of units that are elements of the populations and the time extents of the group [8]. For example, a target population may be defined as, "All persons above the age of 18 who reside in housing units in South Africa, Zimbabwe, Lesotho and Swaziland at any time during April, 2007." (Note that this would in turn require explication of the definition of 'household.')
      • To ensure a clear description of the target population, think about all the potential inclusion/exclusion criteria. For example, the target population might exclude:
        • Immigrants, ethnic minorities, homeless or nomadic populations, language groups.
        • Persons in institutions, such as hospitals, nursing homes, prisons, group quarters, or military bases.
        • Persons living in certain geographic regions.
        • Persons outside a defined age range.
    • Define the survey population within each participating country by refining the target population based on the cost, security or access restrictions to all target population elements [8].
      • For example, the survey population may exclude those residing in war-torn areas, or the data collection period may be narrowed in areas with civil disturbances that are threatening to escalate.
    Lessons learned
    • One of the major interests of The World Mental Health (WMH) Study was comparison of the age of onset of disease across countries. Best practice might suggest strictly defining the age of majority (e.g., 18 years old). However, the WMH study organizers recognized that strictly defining this inclusion criterion would be difficult, given that age of majority varies by country (and even within a country), and a strict definition would affect study protocols such as ethics reviews and informed consent (seeking permission to interview minors). Therefore, the WMH Study had to make a difficult decision over whether to strictly define the inclusion age or allow it to vary across countries. In the end, the WMH Study allowed the age range to vary, and this was taken into consideration in the analysis stage.
  3. Identify and evaluate potential sampling frames. Select or create the sampling frame that best covers the target population given the country's survey budget.
    Rationale

    The ideal sampling frame enumerates all of the elements of the target population. However, very few sampling frames exist that allow access to every element in the target population. The goal is therefore to choose a sampling frame that allows access to the most elements in the target population, given the survey budget.

    Procedural steps
    • Allow the sample design, beginning with the selection of the sampling frame, to differ across participating countries.
      • Since participating countries will have different sampling resources, allowing the types of sampling frames to differ across countries gives each country the freedom to use its own resources to select or create a sampling frame that best covers the target population. However, it is important to build in checks that ensure probability sampling is used at each stage of selection [19].
    • Have each participating country identify available sampling frames among:
      • Pre-existing list sampling frames.
        • Telephone directories.
        • Official population registries.
        • Postal registries.
        • Electoral rolls.
        • Respondents or sample from another study.
        • Other list(s) of addresses, phone numbers or names.
      • Random Digit Dialing (RDD) frame [23].
      • Multistage area probability sampling frames of households.
    • Use area probability samples when no adequate pre-existing sampling frames for selecting households exist. Here we outline a simple two-stage area probability sample of households that is employed in many cross-cultural surveys. Many texts and documents provide detailed guidance regarding the development of area probability samples [14] [24]. Additional information can also be found in Appendix A.
      • Start by creating primary sampling units (PSUs): geographic clusters that are often Census enumeration areas. The geographic clusters should be large enough to contain a population that is heterogeneous with respect to the survey variables of interest, but small enough to realize the travel-related cost efficiencies of clustered sample observations.
      • Next, list the housing units (secondary sampling units (SSUs)) within selected PSUs and select a simple random sample or systematic sample of housing units from the list.
      • Maintain a uniform definition of what constitutes a housing unit. In 1998, the United Nations defined a housing unit as, "a separate and independent place of abode intended for habitation by a single household, or one not intended for habitation but occupied as living quarters by a household at the time of the census. Thus it may be an occupied or vacant dwelling, an occupied mobile home or improvised housing unit, or any other place occupied as living quarters by a household at the time of the census. This category thus includes housing of various levels of permanency and acceptability [22]."
      • A commonly used definition in the United States is "a physical structure intended as a dwelling that has its own entrance separate from other units in the structure and an area where meals may be prepared and served [7]."
      • Some cross-cultural surveys have used satellite technology to help identify and list households, settlements and habitations, especially those in hard-to-find areas such as mountainous, riverine and creek regions [20].
    • When the frame is a list of housing units, such as an area probability sample or a list of addresses, list the eligible persons within each selected housing unit.
      • Choose a housing unit residency rule to identify eligible respondents within each housing unit. Similar to defining a target population, once the rule is defined, it should be consistent across all participating countries. Choose between:
        • De facto residence rule: persons who slept in the housing unit the previous night.
          • Advantage: Easy to remember.
        • De jure residence rule: persons who "usually" sleep in the housing unit.
          • Advantage: A better representation of the typical residents of a housing unit.
    • Evaluate how each potential sampling frame covers the target population [7]. (For more information, refer to Appendix A.)
      • Examine the sampling frame for a one-to-one mapping between the elements on the sampling frame and the target population. There are four potential problems.
        • Undercoverage (missing elements): elements in the target population do not appear on the sampling frame.
        • Ineligible elements: elements on the sampling frame that do not exist in the target population.
        • Duplication: several sampling frame elements match to one target population element.
        • Clustering: one sampling frame element matches to many target population elements.
      • If multiple frames are chosen, consider the following possibilities for combining frames:
        • First, determine for each element on the combined frame whether it is a member of Frame A only, Frame B only, or both Frame A and B.
          • If the membership of each element can be determined before sampling, then duplicates can be removed from the sampling frame.
          • A variation on this is to use a rule that can be applied to just the sample, rather than to the entire frame. Frame A might be designated the controlling frame, in the sense that a unit that is in both frames is allowed to be sampled only from A. After the sample is selected, determine whether each unit sampled from B is on the A frame, and retain the unit only if it is not on frame A. This method extends to more than two frames by assigning a priority order to the frames.
          • If the membership cannot be determined prior to sampling, then elements belonging to both frames can be weighted for unequal probabilities of selection after data collection (see Data Processing and Statistical Adjustment for best practices for weighting and nonresponse adjustments).
      • Evaluate the cost of obtaining or creating each potential sampling frame.
        • It is most often cheaper to purchase pre-existing lists than to create area probability frames. However, the purchase cost is often trivial compared to the costs of performing the survey interviews. Area probability samples are more costly to develop, but they facilitate cost-effective clustering for interviews. Sampling named individuals from registries requires tracking of potential respondents who no longer live at the address listed on the registry, or who have changed phone numbers. Tracking these individuals can be very expensive.
    • Select the sampling frame based on the coverage error vs. cost tradeoff.
    • Update the frame, if necessary. World Health Survey (WHS) administrators have suggested that frames that are two years or more out of date require updating [24].
      • If the frame is a pre-existing list, contact the provider of the list for the newest version.
      • If the frame is an area probability sample and the target population has undergone extensive movement or substantial housing growth since the creation of the frame, then updating the PSUs and SSUs will be important. However, what is most important is the quality of the enumerative listing.
      • Remove any duplicates or ineligible units.
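    The controlling-frame rule for combining frames described above can be sketched in a few lines. This is only a minimal illustration, assuming that membership of a sampled unit on each frame can be looked up after selection; the function name, the frame labels and the unit identifiers are hypothetical and are not taken from any survey described in this chapter.

```python
def apply_frame_priority(samples_by_frame, frame_members, priority):
    """Keep a sampled unit only if it is on no higher-priority frame.

    samples_by_frame: frame name -> list of sampled unit ids (hypothetical data)
    frame_members:    frame name -> set of all unit ids on that frame
    priority:         frame names, controlling frame first
    """
    retained = []
    for rank, frame in enumerate(priority):
        higher_frames = priority[:rank]
        for unit in samples_by_frame.get(frame, []):
            # A unit that also appears on a higher-priority frame may only
            # enter the sample through that frame, so drop it here.
            if any(unit in frame_members[h] for h in higher_frames):
                continue
            retained.append((frame, unit))
    return retained


# Hypothetical example: unit "u3" is on both frames; A is the controlling frame.
frame_members = {
    "A": {"u1", "u2", "u3"},
    "B": {"u3", "u4", "u5"},
}
samples_by_frame = {"A": ["u1", "u3"], "B": ["u3", "u5"]}

retained = apply_frame_priority(samples_by_frame, frame_members, ["A", "B"])
print(retained)  # [('A', 'u1'), ('A', 'u3'), ('B', 'u5')]
```

    Because the check is applied only to sampled units, the full frames never have to be matched against each other; only the frame membership of each selected unit needs to be determined.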
    Lessons learned
    • Many countries do not have complete registers of the population and therefore construct area probability frames for sample selection. Some surveys in majority countries have shown that it can be difficult to enumerate the rural, poor areas [1] [5] [11] (see Footnote 1), and consequently surveys in these countries may under-represent poorer or more rural residents. If the statistic of interest is correlated with income and/or urbanicity, the sample estimate will be biased. For example, the Tibet Eye Care Assessment, a study on blindness and eye diseases in the Tibet Autonomous Region of China, used an area probability design to develop the sampling frame [5]. One of the PSUs was the township of Nakchu, an area of high elevation that is primarily populated by nomadic herders. Because of the elevation and rough terrain, Nakchu proved difficult to enumerate accurately. As a result, the survey sample underrepresented the residents of the roughest terrain of Nakchu. This was potentially important, as ophthalmologists believe that Tibetans who live in the most inaccessible regions and the highest elevation have the highest prevalence of eye disease and visual impairment.
  4. Choose a selection procedure that will randomly select elements from the sampling frame and assure that important subgroups in the population will be represented.
    Rationale

    Sample selection is a crucial part of the survey lifecycle. Since we cannot survey every possible element from the target population, we must rely on probability theory to be able to make inferences from the sample back to the target population.

    Procedural steps
    • Consider only selection methods that will provide a probability sample.
      • Theory for estimating sampling errors has been developed for probability samples, and it applies to any type of population. By ensuring a random selection of samples, probability methods also protect a researcher against accusations that his/her conscious or unconscious bias affected the selection.
      • Creating a frame where each element has a known, nonzero probability of selection can, in some cases, be very costly in terms of both time and effort. To reduce costs, some survey organizations select non-probability samples such as convenience samples (units are selected at the convenience of the researcher, and no attempt is made to ensure that the sample accurately represents the target population) or quota samples. Upon the completion of data collection with such a sample, the survey organization typically calculates population estimates, standard errors and confidence intervals as though a probability sample had been selected. In using a non-probability method as a proxy for a probability method, the survey organization makes the assumption that the non-probability sample is unbiased. While not all non-probability samples are biased, the chance of bias is extremely high and, most importantly, cannot be measured. A survey that uses a non-probability sampling method cannot estimate the true error in the sample estimates [9].
    • Consider the best possible probability selection methods available. (See Appendix B for additional information about each selection method.)
      • Simple Random Sampling (SRS) without replacement: each element on the frame has an equal probability of selection, and each combination of n elements has the same probability of being selected.
        • Advantages of SRS: The procedure is easy to understand and implement.
        • Disadvantages of SRS: The costs in attempting to interview a simple random sample of persons can be quite high, and an SRS provides no assurance that important subpopulations will be included in the sample.
      • Systematic Sampling (a common way to considerably reduce the operational effort needed to select the sample): the selection of every kth element on the sampling frame after a random start.
        • Advantages of systematic sampling:
          • Substantial reduction of the operational time necessary to select the sample.
          • If the sampling frame is sorted in groups prior to selection, the systematic sampling method will select a proportionately allocated sample (see description below of stratified sampling). This is often referred to as "implicitly stratified sampling."
        • Disadvantages of systematic sampling:
          • If the sampling frame is sorted in cycles of values of the survey variables (e.g., 2, 4, 6, 2, 4, 6...) and the selection interval coincides with a multiple of the length of the cycle, the systematic sampling method will not perform well [11].
          • If the list is sorted in a specific order before selection, the repeated sampling variance of estimates cannot be computed exactly.
      • Stratified Sampling (see Appendix B for a detailed description).
        • Purpose: To help ensure that specified subpopulations are represented in the sample.
        • Advantages of stratified sampling:
          • Depending on the allocation of elements to the strata, the method can produce gains in precision (i.e., decrease in sampling variance) for the same cost, by making certain that essential subpopulations are included in the sample.
          • Virtually all practical sampling uses some form of stratification.
      • Cluster Sampling (see Appendix B for a detailed description).
        • When survey populations are spread over a wide geographic area and interviews are to be done face-to-face, it can be very costly to create an element frame and visit n units randomly selected over the entire area.
        • Select clusters of frame elements jointly, rather than selecting individual elements one at a time.
        • The only population elements listed are those for the selected clusters.
        • Advantages of cluster sampling:
          • Costs less than SRS.
          • Allows sampling to be done when a full frame of elements is not available for the entire population.
        • Disadvantage of cluster sampling: estimates are not as precise as with SRS, necessitating a larger sample size in order to get the same level of precision.
      • Two-Phase (or Double) Sampling (see Appendix B for a further description).
        • The concept of two-phase sampling is to sample elements, measure one or more variables on these 1st-phase elements, and use that information to select a 2nd-phase subsample. A common application is to collect 1st-phase data that is used to stratify elements for the 2nd-phase subsample.
        • Survey samplers use two-phase sampling to help reduce nonresponse, with the stratifying variable from phase one being whether the person responded to the initial survey request. For example, samplers might select a subsample of nonrespondents and try to entice the nonrespondents to participate by offering incentives.
      • Replicated (or Interpenetrated) Sampling: a method in which "the total sample is made up of a set of replicate subsamples, each of the identical sample design" [11].
        • Advantages of using replicated sampling:
        • Disadvantages of using replicated sampling:
          • In face-to-face surveys, random assignment of interviewers to areas, rather than assignment to geographically proximal areas of the country, can lead to very large increases in survey costs.
          • Loss in precision of sampling variance estimators: a small number of replicates leads to a decrease in the number of degrees of freedom when calculating confidence intervals.
      • Combination of techniques — Stratified multistage cluster design.
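    As an illustration of the systematic selection described above (every kth element after a random start, with a frame sorted into groups before selection), the following sketch may be helpful. It is a minimal example under stated assumptions: the frame of 100 housing units, the region labels and the function name are hypothetical, and a fractional selection interval is used so that the frame size need not be an exact multiple of the sample size.

```python
import random
from collections import Counter


def systematic_sample(frame, n, rng):
    """Select n elements by taking every k-th element after a random start."""
    if not 0 < n <= len(frame):
        raise ValueError("n must be between 1 and the frame size")
    k = len(frame) / n            # selection interval (may be fractional)
    start = rng.random() * k      # random start in [0, k)
    last = len(frame) - 1
    # min() guards against floating-point rounding at the very end of the frame.
    return [frame[min(int(start + i * k), last)] for i in range(n)]


# Hypothetical frame of 100 housing units, sorted by region before selection.
frame = ([(i, "North") for i in range(50)]
         + [(i, "South") for i in range(50, 80)]
         + [(i, "East") for i in range(80, 100)])

sample = systematic_sample(frame, n=10, rng=random.Random(2008))
counts = Counter(region for _, region in sample)
print(counts)
```

    With this frame and an interval of 10, the sample contains 5 units from North, 3 from South and 2 from East, matching each region's share of the frame. This is the proportionate allocation that sorting the frame before systematic selection is meant to achieve.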
    Lessons learned
    • Requiring full probability sampling at every stage of selection in a cross-cultural study can present challenges. Some survey organizations have a history of using quota sampling and loosely-controlled respondent substitution. Probability sampling at every stage generally requires more labor and funding than other methods. Therefore, some cross-cultural studies have only used probability sampling in the first stage of selection, and then allowed quota sampling or substitution to occur at later stages [3] [9]. For example, the World Value Study states that full probability samples are preferred, but quota sampling is also allowed at the household level.
    • As discussed in the Lessons Learned section of Guideline 2, the decision to stray from full probability sampling again reflects the conflict between standardization and flexibility in cross-cultural surveys. However, it bears repeating that without probability sampling, one cannot make justifiable inferences about the target population from the sample estimates.
  5. Determine the sample size necessary to meet the desired level of precision for the statistics of interest at population or subpopulation levels for the different potential sample selection procedures.
    Rationale

    After planning the sample design to be used, and before selecting the sample from the survey frame, the sample size must be determined. The sample size takes into account the desired level of precision for the statistic(s) of interest, estimates of the statistic of interest from previous surveys, the design effect, and estimated outcome rates of the survey. See [19] for a detailed treatment of the approach used in the European Social Survey. For a more extensive example of sample size calculation, see Appendix C.

    Procedural steps
    • Have the survey sponsor specify the desired level of precision. Practical experience has determined that often it is easiest for sponsors to conceptualize desired levels of precision in terms of 95% confidence intervals.
    • Convert these 95% confidence intervals into a sampling variance of the mean or proportion. Under the usual normal approximation, a 95% confidence interval with half-width e corresponds to a sampling variance of (e / 1.96)².
    • Obtain an estimate of S² (population element variance).
      • If the statistic of interest is not a proportion, find an estimate of S² from a previous survey on the same target population or a small pilot test.
      • If the statistic of interest is a proportion, the sampler can use the expected value of the proportion (p), even if it is a guess, to estimate S² by using the formula S² = p(1 - p).
    • Estimate the needed number of completed interviews for an SRS by dividing the estimate of S² by the sampling variance of the mean. See [4] for more on sample size computation for SRS.
    • Multiply the number of completed interviews by the design effect to account for a non-SRS design.
    • Calculate the necessary sample size by dividing the number of completed interviews by the expected response rate, eligibility rate, and coverage rate.
      • The sampler can estimate these three rates by looking at the rates obtained in previous surveys with the same survey population and survey design.
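    The procedural steps above can be sketched as a short calculation. This is a minimal illustration rather than the method of any particular survey: it assumes a normal approximation (a multiplier of 1.96 for a 95% confidence interval), ignores any finite population correction, and uses hypothetical inputs (a proportion of 0.5, a half-width of 0.03, a design effect of 1.5, and response, eligibility and coverage rates of 0.60, 0.90 and 0.95).

```python
import math

Z_95 = 1.96  # normal-approximation multiplier for a 95% confidence interval


def required_sample_size(half_width, s2, design_effect,
                         response_rate, eligibility_rate, coverage_rate):
    """Return (SRS completes, design-adjusted completes, units to select)."""
    # Convert the 95% confidence interval half-width into a sampling variance.
    sampling_variance = (half_width / Z_95) ** 2
    # Completed interviews needed under simple random sampling.
    n_srs = s2 / sampling_variance
    # Inflate by the design effect to account for a non-SRS design.
    n_complex = n_srs * design_effect
    # Inflate for expected nonresponse, ineligibility and noncoverage.
    n_selected = n_complex / (response_rate * eligibility_rate * coverage_rate)
    return n_srs, n_complex, math.ceil(n_selected)


p = 0.5  # hypothetical guess at the proportion, so S^2 = p(1 - p) = 0.25
n_srs, n_complex, n_selected = required_sample_size(
    half_width=0.03, s2=p * (1 - p), design_effect=1.5,
    response_rate=0.60, eligibility_rate=0.90, coverage_rate=0.95)

print(round(n_srs), round(n_complex), n_selected)  # 1067 1601 3121
```

    Using p = 0.5 gives the largest possible value of p(1 - p), so it is a conservative choice when little is known about the proportion in advance.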
    Lessons learned
    • Prior to the first implementation of the European Social Survey (ESS), many of the participating survey organizations had never encountered the ideas of sample size determination and calculating design effects [19]. Therefore, the ESS expert sampling panel spent considerable time explaining these concepts. In return, the organizations that were new to these methods were very enthusiastic to learn about them, and eager to meet the standards of the coordinating center. In fact, after completing Round 1 of the study, many nations commented that designing the sample was one of the most educational aspects of the entire survey process, and had significantly improved the survey methods within their country.
  6. Document each step of the sample selection procedure.
    Rationale

    Over the course of many years, various researchers will analyze the same survey data set. In order to provide these different users with a clear sense of how and why the data were collected, it is critical that all properties of the data set be documented. In terms of the sample design and selection, the best time to document is generally shortly after sample selection, when the information regarding sample selection is fresh in one's mind.

    Procedural steps
    • Have the participating countries document the sample selection procedure while selection is occurring or shortly thereafter. Ideally, set a deadline that specifies the number of days after sample selection by which each participating country must send sampling selection documentation to the host survey organization. Be sure to allow for appropriate time to review and revise documentation when setting the deadline. (See Tenders, Bids, and Contracts.)
    • Include within the information the following topics:
      • A clear definition of the survey population, as well as the differences between the target population and survey population.
      • The sampling frame:
        • Both the sampling frame used and the date the frame was last updated, if the frame is a registry or list.
        • A description of the development of the sampling frame and the frame elements.
        • The information available on the frame.
    • For each sample, indicate how many stages were involved in selecting the sample (include the final stage in which the respondent was selected within the household (if applicable)), and a description of each stage, including how many sampling units were selected at each stage.
      • Examples of different stages include:
        • State/province.
        • County or group of counties.
        • City/town, community, municipality.
        • Enumeration district.
        • Area segment/group of neighborhood blocks.
        • Housing unit/physical address (not necessarily the postal address).
        • Postal delivery point/address.
        • Block of telephone numbers (e.g., by regional prefix).
        • Telephone number.
        • Household.
        • Person selected from a household listing.
        • Named person selected from a list, registry or other source that was not a household listing.
      • Examples of how units were selected:
        • All selected with equal probability.
        • All selected with probability proportional to size; specify the measure of size used. (See Appendix C for more on probability proportional to size sampling methods.)
        • Some units selected with certainty, others selected with probability proportional to size; describe the criteria used for certainty selection.
        • Census/enumeration (all units selected with certainty).
        • Units selected using a non-probability method (e.g., convenience sample, quota sample).
      • At each stage of selection, describe the stratifying variables and reasons for choosing these variables. Some examples of commonly used stratifying variables are:
        • Age.
        • Region of the country.
        • State/province.
        • County.
        • City/town, community, municipality.
        • Postal code.
        • Metropolitan status/urbanicity.
        • Size of sampling unit (e.g., population of city).
        • Race/ethnicity.
        • National origin (e.g., Mexican, Nigerian).
      • At each stage of selection, explain the allocation method used and the sample size for each stratum at each stage of selection. (See Appendix C for more on allocation methods in stratified sampling.)
    • If systematic sampling was used at any stage of selection, indicate whether the frame was sorted by any variables prior to systematic selection in order to achieve implicit stratification. If this is the case, describe the variable(s).
    • Describe the time dimension of the design (i.e., one-time cross-sectional, fixed panel, rotating design).
      • If a panel study:
        • State how many previous waves or rounds of data collection there have been for this panel study.
        • Describe the initial sample design for the panel study and any subsequent modifications to the design that are important in documenting this study.
      • If a rotating panel design:
        • Fully describe the rotating panel design for the study (e.g., fresh cross-section is drawn each month and respondents are interviewed once that month, and then re-interviewed once six months later).
        • State the anticipated precision in the estimates.
        • Explain any problems during the sampling process and any deviations from the sampling plan during implementation.
      • Additional sampling documentation:
        • Report any (additional) subsampling of eligible respondents, carried out in order to control the number of interviews completed by respondents with particular characteristics (e.g., one in two eligible males was interviewed, one in four eligible persons with no previous history of depression was interviewed (describe protocol)).
        • Describe any use of replicates (see Data Processing and Statistical Adjustment).
        • Explain whether releases (non-random subsets of the total sample) were used or whether the entire sample was released to data collection staff at the start of the study.
        • Recount in detail any substitution or replacement of sample during data collection.
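    When units are documented as selected with probability proportional to size, it helps to record the measure of size and the resulting selection probabilities. The sketch below shows one common way such a selection can be carried out (systematic selection along the cumulated measures of size). It is an assumption-laden illustration, not the procedure of any study cited here: the PSU names and population counts are hypothetical, and the sketch assumes no unit is large enough to be a certainty selection.

```python
import random
from bisect import bisect_right
from itertools import accumulate


def pps_systematic(sizes, n, rng):
    """Select n units with probability proportional to size.

    sizes: unit id -> measure of size (hypothetical population counts)
    Returns (selected unit ids, unit id -> selection probability).
    """
    units = list(sizes)
    cumulative = list(accumulate(sizes[u] for u in units))
    total = cumulative[-1]
    interval = total / n
    if max(sizes.values()) >= interval:
        # Such units would normally be taken with certainty and set aside.
        raise ValueError("a unit is large enough to be a certainty selection")
    start = rng.random() * interval          # random start in [0, interval)
    selected = []
    for i in range(n):
        point = start + i * interval
        # Find the unit whose cumulated size range contains the point;
        # min() guards against floating-point rounding at the end of the list.
        index = min(bisect_right(cumulative, point), len(units) - 1)
        selected.append(units[index])
    probabilities = {u: n * sizes[u] / total for u in units}
    return selected, probabilities


psu_sizes = {"psu1": 400, "psu2": 1200, "psu3": 800, "psu4": 600, "psu5": 1000}
selected, probabilities = pps_systematic(psu_sizes, n=2, rng=random.Random(19))
print(selected)
print(probabilities)
```

    In this example the selection probabilities are 0.2, 0.6, 0.4, 0.3 and 0.5, which sum to the number of units selected. These probabilities, together with the measure of size, are the kind of detail that is difficult to reconstruct later if it is not documented at the time of selection.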
    Lessons learned
    • As the procedural steps outlined above show, selecting a sample can involve many detailed steps that can be hard to recall after the fact. For example, the coordinating center for the World Mental Health Survey began gathering sampling documentation for weighting and other purposes after many of the participating countries had finished data collection, and found that some countries had a difficult time recalling all the necessary details, such as the sample size for each stratum at each stage of selection. It is wise to document sampling procedures in detail shortly after sample selection.

Appendix A

Appendix B

Appendix C

Footnotes

1 Not all survey methodologists agree with the opinions expressed by these authors regarding enumeration in rural, poor areas. Those who disagree argue that the poor enumerations are mainly due to low expectations and insufficient training and supervision.

Glossary

Convenience sample
A sample of elements that are selected because it is convenient to use them, not because they are representative of the target population.
Coverage
The proportion of the target population that is accounted for on the sampling frame.
Coverage rate
The number of elements on the sampling frame divided by the estimated number of elements in the target population.
Design effect
The impact of the complex survey design on sampling variance, measured as the ratio of the sampling variance under the complex design to the sampling variance that would be obtained from a simple random sample of the same size.
Element
A single unit of the sampling frame.
Eligibility rate
The number of eligible sample elements divided by the total number of elements on the sampling frame.
Fixed panel design
A longitudinal study which attempts to collect survey data on the same sample elements at intervals over a period of time. After the initial sample selection, no additions to the sample are made.
Fixed panel plus births design
A longitudinal study in which a panel of individuals is interviewed at intervals over a period of time and additional elements are added to the sample.
Interviewer variance
That component of overall variability in survey statistics that can be accounted for by the interviewers.
Majority country
A country with low per capita income (the majority of countries).
Nonresponse
A failure to elicit responses from sample persons due to lack of contact or cooperation.
Panel survey
A survey in which data are obtained from the same respondents over time.
Poststratification
A statistical adjustment that assures that sample estimates of totals or percentages (e.g., the estimate of the percentage of men living in Mexico based on the sample) equal population totals or percentages (e.g., the estimate of the percentage of men living in Mexico based on Census data). The adjustment cells for poststratification are formed in a similar way as strata in sample selection, but variables can be used that were not on the original sampling frame at the time of selection.
Primary sampling unit (PSU)
A unit sampled at the first stage of selection.
Probability proportional to size (PPS)
A sampling method in which each sampling unit's probability of selection is proportional to a measure of its size (e.g., the number of households in an enumeration area).
Probability sampling
A sampling method where each element on the sampling frame has a known, non-zero chance of selection.
Quota sampling
A non-probability sampling method that sets specific sample size quotas or target sample sizes for subclasses of the target population. The sample quotas are generally based on simple demographic characteristics (e.g., quotas for gender, age groups and geographic region subclasses).
Random-digit-dialing (RDD)
A method of selecting telephone numbers in which the target population consists of all possible telephone numbers, and all telephone numbers have an equal probability of selection.
Repeated panel design
A series of fixed panel surveys that may or may not overlap in time. Generally, each panel is designed to represent the same target population definition applied at a different point in time.
Replicates
Probability subsamples of the full sample.
Residency rule
A rule to help interviewers determine which persons to include in the household listing, based on what the informant reports.
Response rate
The number of completed interviews divided by the total estimated number of eligible sample persons.
Rotating panel design
A study where elements are repeatedly measured a set number of times, then replaced by new randomly chosen elements. Typically, the newly-chosen elements are also measured repeatedly for the appropriate number of times.
Sampling frames
Lists or materials used to identify all elements (e.g., persons, households, establishments) of a survey population from which the sample will be selected. These lists or materials can include maps of areas in which the elements can be found, lists of members of a professional association, and registries of addresses or persons.
Sampling units
Elements or clusters of elements considered for selection in some stage of sampling. For a sample with only one stage of selection, the sampling units are the same as the elements. In multi-stage samples (e.g., enumeration areas, then households within selected enumeration areas, and finally adults within selected households), different sampling units exist, while only the last is an element. The term primary sampling units (PSUs) refers to the sampling units chosen in the first stage of selection. The term secondary sampling units (SSUs) refers to sampling units within the PSUs that are chosen in the second stage of selection.
Sampling variance
A measure of the variability of the sample estimates of a population parameter, if all possible samples of the same size were selected from the sampling frame.
Secondary sampling unit (SSU)
A unit sampled at the second stage of selection.
Split panel design
A design that contains a blend of cross-sectional and panel samples at each new wave of data collection.
Strata
Non-overlapping groups that comprise all of the elements on the sampling frame.
Substitution
A technique where each nonresponding sample element from the initial sample is replaced by another element of the target population, typically not an element selected in the initial sample.
Survey population
The actual population from which the survey data are collected, given the restrictions from data collection operations.
Target population
The finite population for which the survey sponsor wants to make inferences using the sample statistics.
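
Several of the glossary terms above are simple ratios. The short Python sketch below shows how they could be computed directly from the definitions given here. The function and parameter names are hypothetical, and the input values are illustrative only, not data from any survey.

```python
def _ratio(numerator: float, denominator: float) -> float:
    """Divide two counts or variances, rejecting a non-positive denominator."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return numerator / denominator


def coverage_rate(frame_elements: int, target_population_estimate: int) -> float:
    """Elements on the sampling frame / estimated elements in the target population."""
    return _ratio(frame_elements, target_population_estimate)


def eligibility_rate(eligible_elements: int, frame_elements: int) -> float:
    """Eligible sample elements / total elements on the sampling frame (as defined above)."""
    return _ratio(eligible_elements, frame_elements)


def response_rate(completed_interviews: int, estimated_eligible_persons: float) -> float:
    """Completed interviews / total estimated number of eligible sample persons."""
    return _ratio(completed_interviews, estimated_eligible_persons)


def design_effect(complex_variance: float, srs_variance: float) -> float:
    """Sampling variance under the complex design / variance under simple random sampling."""
    return _ratio(complex_variance, srs_variance)


if __name__ == "__main__":
    # Illustrative inputs only.
    print("coverage rate:   ", coverage_rate(9000, 10000))
    print("eligibility rate:", eligibility_rate(750, 1000))
    print("response rate:   ", response_rate(600, 800))
    print("design effect:   ", design_effect(3.0, 2.0))
```

A design effect above 1 indicates that the complex design yields a larger sampling variance than a simple random sample would; a value below 1 indicates a smaller one.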

References

[1] Bergsten, J. W. (1980). Some sample survey designs in Syria, Nepal and Somalia. Paper presented at the Proceedings of the Survey Research Methods Section, American Statistical Association.

[2] Binder, D. (1998). Longitudinal surveys: Why are these surveys different from all other surveys? Survey Methodology, 24(2), 101-108.

[3] Chikwanha, A. B. (2005). Conducting surveys and quality control in Africa: Insights from the Afrobarometer. Paper presented at the WAPOR/ISSC Conference.

[4] Cochran, W. G. (1977). Sampling Techniques. New York: Wiley & Sons.

[5] Dunzhu, S., Wang, F. S., Courtright, P., Liu, L., Tenzing, C., Noertjojo, K., et al. (2003). Blindness and eye diseases in Tibet: Findings from a randomised, population based survey. British Journal of Ophthalmology, 87(12), 1443-1448.

[6] Häder, S., & Gabler, S. (2003). Sampling and estimation. In J. Harkness et al. (Eds.), Cross-Cultural Survey Methods. New York: Wiley.

[7] Groves, R. M. (1989). Survey Errors and Survey Costs. Hoboken, NJ: Wiley & Sons.

[8] Groves, R. M., Fowler, F. J., Couper, M. P., Lepkowski, J. M., Singer, E., & Tourangeau, R. (2004). Survey Methodology. Hoboken, NJ: Wiley & Sons.

[9] Heeringa, S. G., & O'Muircheartaigh, C. (2008, June). Sampling design for cross-cultural and cross-national studies. Paper to be presented at the 3MC Conference, Berlin, Germany.

[10] Inglehart, R. (1997). Modernization and postmodernization: Cultural, economic and political change in 43 societies. Princeton, NJ: Princeton Univ. Press.

[11] Kalton, G. (1983). Introduction to Survey Sampling. Newbury Park, CA: Sage Publications.

[12] Kasprzyk, D. (1988). The Survey of Income and Program Participation: An overview and discussion of research issues. Washington, DC: U.S. Bureau of the Census.

[13] Kish, L. (1949). A procedure for objective respondent selection within the household. Journal of the American Statistical Association, 44, 380-387.

[14] Kish, L. (1965). Survey Sampling. New York: Wiley & Sons.

[15] Kish, L. (1987). Statistical Design for Research. New York: Wiley & Sons.

[16] Kish, L. (1994). Multipopulation survey designs: Five types with seven shared aspects. International Statistical Review, 62, 167-186.

[17] Lavallée, P., Michaud, S., & Webber, M. (1993). The Survey of Labour and Income Dynamics, Design issues for a new longitudinal survey in Canada. Bulletin of the International Statistical Institute, 49th Session, Contributed Papers, Book 2, 99-100.

[18] Lynn, P. (2005). Longitudinal surveys methodology. Retrieved May 23, 2008, from http://www.eustat.es/prodserv/datos/Sem45_i.pdf.

[19] Lynn, P., Häder, S., Gabler, S., & Laaksonen, S. (2007). Methods for achieving equivalence of samples in cross-national surveys: The European Social Survey Experience. Journal of Official Statistics, 23(1), 107-124.

[20] Okafor, R., Adeleke, I., & Oparac, A. (2007). An appraisal of the conduct and provisional results of the Nigerian Population and Housing Census of 2006. Paper presented at the Proceedings of the Survey Research Methods Section, American Statistical Association.

[21] Peracchi, F. (2002). The European Community Household Panel: A review. Empirical Economics, 27, 63-90.

[22] United Nations. (1998). Principles and recommendations for population and housing censuses, revision 1, para. 2.330. New York: United Nations.

[23] Tucker, C., Lepkowski, J. M., & Piekarski, L. (2002). The current efficiency of list assisted telephone sampling designs. Public Opinion Quarterly, 66, 321-38.

[24] Üstun, T. B., Chatterji, S., Mechbal, A., & Murray, C. J. L. (2005). Chapter X: Quality assurance in surveys: Standards, guidelines, and procedures. In United Nations Statistical Division, United Nations Department of Economic and Social Affairs (Eds.), Household Surveys in Developing and Transition Countries. New York: United Nations.

[25] Yansaneh, I. (2005). Chapter 2: Overview of sample design issues for household surveys in developing and transition countries. In United Nations Statistical Division, United Nations Department of Economic and Social Affairs (Eds.), Household Surveys in Developing and Transition Countries. New York: United Nations.

Further Reading

Sampling

Cochran, W. G. (1977). Sampling Techniques. New York: Wiley & Sons.

Kalton, G. (1983). Introduction to Survey Sampling. Newbury Park, CA: Sage Publications.

Kish, L. (1965). Survey Sampling. New York: Wiley & Sons.

Statistics

Snedecor, G. W., & Cochran, W. G. (1989). Statistical Methods (8th ed.). Ames, IA: Iowa State University Press.

© 2008 The authors of the Guidelines hold the copyright. Please contact us if you wish to publish any of this material in any form.