What happens when one tries to analyse women’s labour force participation using data collected from a voluntary survey? There will obviously be a self-selection bias: we will observe the labour force participation choices of only those women who chose to fill out the census form. As for the women who opted not to fill out the form, we would have no knowledge of whether or not they decided to participate in the labour force.
Now, armed with a data set that suffers from self-selection bias, we would face the challenge of analysing wages for only those women who chose to fill out the form and then chose to participate in the labour force. There are now two self-selection biases in play: one from filling out the census, and the other from participating in the labour force.
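To make the double selection concrete, here is a minimal simulation sketch. Everything in it is an assumption made for illustration: the wage equation, the true schooling coefficient of 0.5, and the two selection rules are invented, not estimated from any census data. It generates a population, then fits the same simple regression on everyone, on those who fill out the form, and on those who both fill out the form and work.

```python
import random


def ols_slope(xs, ys):
    """Slope from a one-variable least-squares fit of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def simulate(n=20000, seed=0):
    """Return the estimated schooling slope for three nested samples."""
    rng = random.Random(seed)
    everyone, form_only, form_and_work = [], [], []
    for _ in range(n):
        schooling = rng.gauss(0, 1)    # observed by the researcher
        unobserved = rng.gauss(0, 1)   # e.g. motivation; never observed
        # Hypothetical potential wage; the true schooling slope is 0.5.
        wage = 1.0 + 0.5 * schooling + unobserved
        # Hypothetical selection rules: both decisions depend on the
        # same unobserved trait that also drives wages.
        fills_form = schooling + unobserved + rng.gauss(0, 1) > 0
        works = schooling + unobserved + rng.gauss(0, 1) > 0
        row = (schooling, wage)
        everyone.append(row)
        if fills_form:
            form_only.append(row)
            if works:
                form_and_work.append(row)
    samples = {
        "everyone": everyone,
        "form_only": form_only,
        "form_and_work": form_and_work,
    }
    return {name: ols_slope(*zip(*rows)) for name, rows in samples.items()}


for name, slope in simulate().items():
    print(f"{name:>14}: estimated slope = {slope:.3f}")
```

Under these assumed rules, the regression on the full population recovers a slope close to 0.5, while the regression on the women who are actually observed does not. This sketch only demonstrates the problem; it does not attempt the selection correction itself.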
“The only winner here is Jim Heckman [recipient of the 2000 Nobel Memorial Prize in Economic Sciences (with Daniel McFadden)] as his cite count will go up as smart researchers in sociology, demography and economics will have to come up with a selection correction to model who bothers to fill out the form and then a researcher studying women's labor force participation would need to estimate a second selection correction. I'd like to see somebody work out the "three step" standard error formula in this case!”
What Mathew means is that a researcher has to account for not just one, but two self-selection biases to be able to estimate unbiased coefficients in an econometric model that explains wage differentials between men and women. Obviously, this fine point will be lost on those who never made it beyond Statistics 101. For them, Mathew Kahn offers a simpler example:
“Suppose that highly motivated busy people don't bother [filling out the Census], then the "average" person in Canada will look "lazy" because the sample who fills out the survey will omit the high achievers.”
Put simply, the voluntary census long form will make Canada look lazier than it really is!
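The "lazy Canada" effect is easy to reproduce with a toy simulation. The sketch below assumes a made-up "motivation" score with a population average of zero, and a made-up rule under which more motivated people are less likely to return the form. Neither assumption comes from real data.

```python
import math
import random


def simulate_survey(n=20000, seed=1):
    """Return (population mean, respondent mean) of a motivation score."""
    rng = random.Random(seed)
    population, respondents = [], []
    for _ in range(n):
        motivation = rng.gauss(0, 1)
        # Hypothetical response rule: the busier (more motivated) the
        # person, the lower the chance of filling out a voluntary form.
        p_respond = 1 / (1 + math.exp(motivation))
        population.append(motivation)
        if rng.random() < p_respond:
            respondents.append(motivation)
    pop_mean = sum(population) / len(population)
    resp_mean = sum(respondents) / len(respondents)
    return pop_mean, resp_mean


pop_mean, resp_mean = simulate_survey()
print(f"true population average motivation: {pop_mean:.2f}")
print(f"average among survey respondents:   {resp_mean:.2f}")
```

The respondents' average comes out well below the true population average, even though nobody's motivation changed. Only the mix of people who answered did.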
In addition, a paper on census response rates (Vigdor, 2004) found that racially and socio-economically diverse communities had lower response rates at the county level. This implies that racial minorities are more involved in the census when their numbers increase within the community.
Furthermore, in 2003 the US Census Bureau tested response rates for the mandatory American Community Survey by sending a portion of respondents the same survey while indicating that their response was voluntary. The change from a mandatory to a voluntary survey resulted in a huge 20.7% decline in the response rate. The Bureau also observed that the largest declines occurred in communities that had recorded high response rates in the 2000 Census.
These issues have serious implications for converting the mandatory long form into a voluntary survey.
Community Composition and Collective Action: Analyzing Initial Mail Response to the 2000 Census
Jacob L. Vigdor (Duke University)
The Review of Economics and Statistics, February 2004, 86(1): 303–312
This paper analyzes how community heterogeneity influences resident decisions to undertake actions generating public benefits. The decision in question is completing and returning the 2000 Census questionnaire, an action which secures a significant amount of federal grants for the community. The model developed to explain this action allows members of societal groups to differentially value public benefits that accrue to other group members. Racial, generational, and socioeconomic class heterogeneity all predict significantly lower response rates at the county level. The potential for endogenous sorting into heterogeneous counties implies that the magnitude of true behavioral effects exceeds these estimates. Copyright (c) 2004 President and Fellows of Harvard College and the Massachusetts Institute of Technology.