"Statistical and Methodological Issues in the Analysis of Complex Sample Survey Data"
Standard methods for the statistical analysis of survey data assume that the data arise from a simple random sample of the target population. In practice, analysts of large public-release survey data sets collected from nationally representative samples (e.g., the National Comorbidity Survey-Replication, or NCS-R) often pay little attention to characteristics commonly associated with survey data, including unequal probabilities of observation, stratified multistage sample designs, and missing data. Most standard statistical procedures in software packages commonly used for data analysis (e.g., SAS, SPSS, and Stata) do not take these properties of survey data into account; the analyst must instead use specialized survey procedures. Failure to use these specialized procedures can have an important impact on the results of all types of analyses, ranging from simple descriptive statistics to estimates of parameters of multivariate models.
This presentation provides a practical introduction to specialized methods for estimation and inference that have been developed for the analysis of complex sample survey data. Complex sample designs are introduced first, including the development of design-based sampling weights and the motivation for the use of weights when analyzing survey data. Methods for variance estimation in the complex sample design setting are then introduced, along with the differences between design-based and model-based approaches to making inferences from survey data. Available software options for these types of analyses are then discussed, along with multiple imputation approaches for handling missing data. Finally, two motivating illustrations are presented using the Complex Samples module of the SPSS software package for a step-by-step analysis of the NCS-R data set, and important issues regarding the analysis of subclasses are discussed. The presentation concludes with a general participant question-and-answer session.