XV. Assessing the Quality of Cross-Cultural Surveys

Webpage last modified: 2009-Mar-03

Introduction

In mono-cultural surveys, assessing the quality of survey data requires adequate documentation of the entire survey lifecycle and an understanding of the protocols used to assure quality. Assessment procedures and criteria become more complicated in cross-national research. In addition to facing methodological, organizational, and operational barriers to monitoring quality and producing documentation, such research often includes additional production processes, such as adaptation and translation of questions and pretesting in diverse contexts.

The framework adopted by these guidelines for assessing quality is informed by research on quality management, which defines quality in terms of three main criteria: fitness for use [8], total survey error [6], and survey process quality [9].

Discussions of quality often focus on fitness for use, survey error, or both. Fitness for use is the extent to which statistics meet the requirements of users of the data. Statistical agencies commonly apply this criterion when defining quality. The dimensions of quality that national statistical agencies use to define quality vary to some extent, but there is agreement on several key dimensions. For example: Are concepts, constructs, indicators, and measures relevant and appropriate for addressing research questions of interest? Are the data accurate? Accuracy is generally what is addressed in discussion of total survey error, that is, accuracy of a survey statistic in terms of construct validity, measurement, and representativeness.

Figure 1 shows seven dimensions of quality that are often used to assess the quality of official statistics (see, for example, [2] and [4]):

  1. Relevance — do the data meet the needs of the client or users? In a multi-cultural context, where there will be many clients or users with possibly competing goals, the dimension of relevance becomes more challenging to fulfill.
  2. Accuracy — do the data describe the phenomena that they were designed to measure? Accuracy is discussed in terms of the sources of survey error [6] (e.g., sampling error, measurement error, and nonresponse error; see Figure 2).
  3. Timeliness — how much time has elapsed between the end of the data collection and when the data are available for analysis? One example of the challenge of providing timely data in cross-national research would be a political election study in nations with elections occurring in different time periods.
  4. Accessibility — can data be easily obtained by users? In the cross-national context, data access can mean more than simply making data publicly available. Access, particularly in majority countries, may also need to include capacity building or training activities to make the data truly accessible to local populations. Country-level data access laws and regulations will also come into play.
  5. Interpretability — are supplementary data available to analysts that describe the major characteristics and structure of the data (metadata) as well as data about the survey processes (paradata)?
  6. Coherence — are the data available for further recombination with other statistical information for various secondary purposes?
  7. Comparability — to what extent are observed data differences due to genuine variation as opposed to other factors? The quality dimensions of coherence and comparability are the raison d'être for cross-national and cross-cultural survey research.

Appendix A provides a list of guidelines from specific modules, highlighting recommendations in relation to these dimensions of quality. Fitness for use on these dimensions may be affected further by high or low cost, burden, design constraints, and professionalism:

  1. Cost — to what extent did cost play a factor in implementation decisions?
  2. Burden — were the concerns of burden to respondents adequately considered?
  3. Design Constraints — were there context-specific constraints on study design that may have had an impact on quality (for example, using a different mode of interview in one survey implementation than in others)?
  4. Professionalism — are staff provided with clear behavioral guidelines and professional training, are there adequate provisions to ensure compliance with relevant laws, and is there demonstration that analyses and reporting have been impartial?

To allow users to determine the fitness of data for use, and to give them enough information to assess the accuracy of data and their comparability across cultures, data producers need to document their process quality management efforts. There has been a move from postsurvey evaluation of quality to monitoring and control during the survey process to ensure data quality [1]. A process quality approach requires quality standards, management for quality, and the collection of standardized study metadata, question metadata, and process paradata [3]. Figure 3 shows the elements of process quality management that allow users to assess process quality: quality planning and assurance, quality monitoring and control, and a quality profile.

Figure 1. Fitness for Use: Dimensions of Quality

Figure 2. Sources of Error That Affect Accuracy as a Dimension of Quality

A quality profile synthesizes information from other sources. It documents all aspects of the survey and provides indicators of process quality, descriptions of the sources of sampling and nonsampling error, and recommendations for improvement and further research. It gives the user all available information to help assess data quality in terms of fitness for use, survey error, and other factors. See [5] for one set of guidelines for such reports, and [7] and [10] for examples of actual quality profiles.

Figure 3. The Elements of Process Quality Management

Organizations and projects will vary in the cost-quality tradeoffs they make and in the items they monitor for quality purposes. However, if each organization in a cross-cultural study provides a standardized quality profile with adequate information, users will be able to assess quality and comparability across cultures.
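
As an illustration only, a standardized quality profile could be held as a small record per organization, so that gaps between countries are easy to spot. The sketch below is an assumption, not a format prescribed by these guidelines: the class `QualityProfile`, its field names, and the helper `missing_indicators` are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class QualityProfile:
    """Hypothetical minimal quality profile for one organization or country."""
    country: str
    # Indicators of process quality, e.g. {"response_rate": 0.5}
    process_indicators: dict[str, float] = field(default_factory=dict)
    # Free-text descriptions of the sources of error
    sampling_error_notes: str = ""
    nonsampling_error_notes: str = ""
    # Recommendations for improvement and further research
    recommendations: list[str] = field(default_factory=list)


def missing_indicators(profiles):
    """Map each country to the indicators that other countries report but it omits."""
    all_names = set()
    for profile in profiles:
        all_names.update(profile.process_indicators)
    gaps = {}
    for profile in profiles:
        missing = sorted(all_names - set(profile.process_indicators))
        if missing:
            gaps[profile.country] = missing
    return gaps


profiles = [
    QualityProfile("Country A", {"response_rate": 0.5, "contact_rate": 0.7}),
    QualityProfile("Country B", {"response_rate": 0.6}),
]
gaps = missing_indicators(profiles)
```

Here `gaps` flags that the hypothetical Country B reports no contact rate, which is the kind of omission that makes cross-cultural comparison harder.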

For each Guidelines module, except those still under construction, Appendix B summarizes the recommended elements of quality planning and assurance, quality monitoring and control, and a quality profile.

Appendix A

Appendix B

Glossary

Bias
A systematic difference between the survey estimate of the population parameter and the true value in the population.
Coding
Translating nonnumeric data into numeric fields.
Contact rate
The proportion of all cases in which some responsible member of the housing unit was reached by the survey.
Context effects
The impact of question context, such as the order or layout of questions, on survey responses.
Coverage error
Survey error (variance or bias) that is introduced when some units in the target population are not included on the sampling frame.
Coversheet
Electronic or printed materials associated with each case that identify information about the case, e.g., the sample address, the unique identification number associated with a case, and the interviewer to whom a case is assigned. The coversheet often also contains an introduction to the study, instructions on how to screen sample members and randomly select the respondent, and space to record the date, time, outcome, and notes for every attempt.
Disposition code
A code that indicates the result of a specific call attempt or the outcome assigned to a sample element at the end of data collection (e.g., noncontact, refusal, ineligible, complete interview).
Element
A single unit of the sampling frame.
Hours Per Interview (HPI)
A measure of study efficiency, calculated as the total number of interviewer hours spent during production (including travel, reluctance handling, listing, completing an interview, and other administrative tasks) divided by the total number of interviews.
Imputation
Computational methods that assign one or more estimated answers for each item that previously had missing, incomplete or implausible data.
Majority country
A country with low per capita income (the majority of countries).
Measurement error
Survey error (variance or bias) due to the measurement process; that is, error introduced by the survey instrument, the interviewer, or the respondent.
Metadata
Data that describe other data. The term encompasses a broad spectrum of information about the survey, from the study title and sample design to details such as interviewer briefing notes, and contextual information such as legal regulations, customs, and economic indicators.
Mode
Method of data collection.
Nonresponse error
Error (variance or bias) that is introduced when not all sample members participate in the survey (unit nonresponse) or when a sample member does not answer all survey items (item nonresponse).
Nonresponse bias
Bias that is introduced when not all sample members participate in the survey or answer a survey item and those that do not (the nonrespondents) differ from the respondents on the measure of interest.
Paradata
Process data collected during data collection, such as timestamps, keystrokes, interviewer observations, etc.
Poststratification (adjustment)
A statistical adjustment that assures that sample estimates of totals or percentages (e.g. the estimate of the percentage of men living in Mexico based on the sample) equal population totals or percentages (e.g. the estimate of the percentage of men living in Mexico based on Census data). The adjustment cells for poststratification are formed in a similar way as strata in sample selection, but variables can be used that were not on the original sampling frame at the time of selection.
Probability sampling
A sampling method where each element on the sampling frame has a known, non-zero chance of selection.
Proxy interview
An interview with anyone other than the person about whom information is being sought (e.g., parent, spouse).
Quality
Achieving excellence for all components related to the data.
Quality assurance
Statement of confidence that quality requirements will be fulfilled.
Quality control
Process focused on fulfilling quality requirements.
Replicates
Probability subsamples of the full sample design.
Response rate
The number of completed interviews divided by the total estimated number of eligible sample persons.
Sample management system
A computerized and/or paper-based system used to assign and monitor sample cases and record documentation for sample records (e.g., time and outcome of each contact attempt).
Sampling error
Survey error (variance or bias) due to observing a sample of the population rather than the entire population.
Sampling frame
Lists or materials used to identify all elements (e.g., persons, households, establishments) of a survey population from which the sample will be selected. These lists or materials can include maps of areas in which the elements can be found, lists of members of a professional association, and registries of addresses or persons.
Survey estimate
The value yielded by a survey.
Survey population
The actual population from which the survey data are collected, given the restrictions from data collection operations.
Target population
The finite population for which the survey sponsor wants to make inferences using the sample statistics.
(Sampling) Units
Elements or clusters of elements considered for selection in some stage of sampling. For a sample with only one stage of selection, the sampling units are the same as the elements. In multi-stage samples (e.g., enumeration areas, then households within selected enumeration areas, and finally adults within selected households), different sampling units exist, while only the last is an element. The term primary sampling units (PSUs) refers to the sampling units chosen in the first stage of selection. The term secondary sampling units (SSUs) refers to sampling units within the PSUs that are chosen in the second stage of selection.
Weight(ing)
A post-survey adjustment that may account for differential coverage, sampling, and/or nonresponse processes.
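
Several of the glossary terms above (contact rate, response rate, Hours Per Interview, and poststratification) are simple ratios, and a short sketch may make them concrete. The code below is a minimal illustration under stated assumptions, not a standard implementation: the disposition code labels and function names are hypothetical, eligibility is assumed to be known for every case (so the "estimated number of eligible sample persons" is simply the count of cases not coded ineligible), and ineligible cases are left out of the contact rate denominator.

```python
from collections import Counter

# Hypothetical final disposition codes; a real study defines its own code frame.
COMPLETE = "complete"
REFUSAL = "refusal"
NONCONTACT = "noncontact"
INELIGIBLE = "ineligible"


def process_indicators(final_dispositions, interviewer_hours):
    """Compute contact rate, response rate, and hours per interview (HPI)."""
    counts = Counter(final_dispositions)
    completes = counts[COMPLETE]
    contacted = completes + counts[REFUSAL]    # someone was reached
    eligible = contacted + counts[NONCONTACT]  # assumption: eligibility is known
    if eligible == 0 or completes == 0:
        raise ValueError("need at least one eligible case and one completed interview")
    return {
        "contact_rate": contacted / eligible,
        "response_rate": completes / eligible,
        "hours_per_interview": interviewer_hours / completes,
    }


def poststratification_weights(sample_counts, population_shares):
    """Weight per adjustment cell so weighted sample shares equal population shares."""
    n = sum(sample_counts.values())
    return {
        cell: population_shares[cell] / (count / n)
        for cell, count in sample_counts.items()
    }


# Invented numbers for illustration only.
dispositions = (
    [COMPLETE] * 50 + [REFUSAL] * 20 + [NONCONTACT] * 30 + [INELIGIBLE] * 10
)
indicators = process_indicators(dispositions, interviewer_hours=200.0)
weights = poststratification_weights(
    {"men": 40, "women": 60}, {"men": 0.5, "women": 0.5}
)
```

With these invented inputs, 70 of 100 eligible cases were contacted and 50 completed an interview in 200 interviewer hours, so the contact rate is 0.7, the response rate is 0.5, and HPI is 4.0. The poststratification weights scale up the underrepresented cell (men) and scale down the overrepresented one, so that the weighted share of men matches the population share.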

References

[1] Biemer, P.P., & Lyberg, L.E. (2003). Introduction to Survey Quality. Hoboken, NJ: Wiley.

[2] Brackstone, G. (1999). Managing Data Quality in a Statistical Agency. Statistics Canada, Survey Methodology, Catalogue No. 12-001-XPB, 25(2), 1-23.

[3] Couper, M.P., & Lyberg, L. (2005). "The Use of Paradata in Survey Research." Paper presented at the International Statistical Institute Meetings, Sydney, April.

[4] Eurostat (2003). Methodological Documents — Definition of Quality in Statistics. Report of the Working Group Assessment of Quality in Statistics, item 4.2. Luxembourg, October.

[5] Eurostat. (2003). Methodological Documents — Standard Report. Report of the Working Group Assessment of Quality in Statistics, item 4.2B. Luxembourg, October.

[6] Groves, R.M., Couper, M.P., Lepkowski, J.M., Singer, E., & Tourangeau, R. (2004). Survey Methodology. Hoboken, NJ: Wiley.

[7] Institute for Social and Economic Research [ISER] (2006). Quality Profile: British Household Panel Survey (Version 2.0). University of Essex. http://www.iser.essex.ac.uk/ulsc/bhps/quality-profiles/BHPS-QP-01-03-06-v2.pdf

[8] Juran, J.M., & Gryna, F.M., Jr. (1980). Quality Planning and Analysis (2nd ed.). New York: McGraw-Hill.

[9] Lyberg, L.E., Biemer, P., Collins, M., DeLeeuw, E.D., Dippo, C., Schwarz, N., & Trewin, D. (Eds.). (1997). Survey Measurement and Process Quality. New York: Wiley.

[10] U.S. Bureau of the Census (1998). SIPP Quality Profile. SIPP Working Paper No. 230 (3rd Edition). http://www.census.gov/sipp/workpapr/wp230.pdf

© 2008 The authors of the Guidelines hold the copyright. Please contact us if you wish to publish any of this material in any form.