Where sample selection comes from

  1. self-selection

Observed: working women have a higher market wage than home wage.
Reasoning: working status (a career) increases earnings.
Bias: more skilled women choose to work, and skill itself increases earnings.

Observed: immigrants earn more than non-immigrants.
Reasoning: immigration increases earnings.
Bias: more skilled workers choose (or are able) to work as immigrants.

  2. collection-selection

Observed: the effect of spouse income on personal health (panel data).
Reasoning: household income influences well-being.
Bias: only respondents with a stable marital status stay in the sample, and relationship stability itself influences well-being.

Overall, sample selection bias emerges when factors “determining the probability of entrance into the sample” (Heckman, 1979) confound the estimates of interest in a regression model.
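A small simulation may help make this concrete. The sketch below is not taken from Heckman (1979); every number in it (sample size, coefficients, the error correlation `rho`) and every variable name is an assumption chosen for illustration. It generates an outcome `y` whose true slope on `x` is 2, lets an unobserved factor `u` drive both entrance into the sample and the outcome error, and compares OLS on the full population with OLS on the selected rows only.

```r
# Simulated illustration of self-selection (all values are assumptions).
set.seed(42)
n   <- 20000
rho <- 0.8                      # correlation between selection and outcome errors

x <- rnorm(n)                   # regressor of interest (e.g. a skill measure)
z <- rnorm(n)                   # factor that only affects entrance into the sample
u <- rnorm(n)                   # unobserved part of the selection decision
e <- rho * u + sqrt(1 - rho^2) * rnorm(n)   # outcome error, correlated with u

y        <- 1 + 2 * x + e               # true slope on x is 2
selected <- (0.5 * x + z + u) > 0       # only these rows enter the sample

ols_full     <- lm(y ~ x)                       # infeasible benchmark: everyone
ols_selected <- lm(y ~ x, subset = selected)    # what the analyst can actually fit

slope_full     <- unname(coef(ols_full)["x"])
slope_selected <- unname(coef(ols_selected)["x"])
print(c(full = slope_full, selected = slope_selected))
```

Under these assumptions the full-sample slope should sit close to 2, while the selected-sample slope is pulled below it, because units with low `x` only enter the sample when their unobserved `u` (and hence `e`) is unusually high.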

Symptoms of sample selection bias

  1. downward-biased estimate of the population variance \(\sigma^2\)

  2. variables that do not belong in the true structural equation appear statistically significant in the fitted model

  3. special cases

  4. multivariate extensions of the preceding analysis might be of substantive interest
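The first two symptoms in the list above can be reproduced with simulated data. This is a minimal sketch under assumed values, not the author's analysis: the population error variance is set to 1, and the hypothetical variable `z` affects only selection, never the outcome. Fitting OLS on the selected rows and including `z` shows both effects at once.

```r
# Simulated check of symptoms 1 and 2 (all values are assumptions).
set.seed(7)
n   <- 20000
rho <- 0.8

x <- rnorm(n)
z <- rnorm(n)                   # does NOT appear in the outcome equation
u <- rnorm(n)
e <- rho * u + sqrt(1 - rho^2) * rnorm(n)   # Var(e) = 1 in the population

y        <- 1 + 2 * x + e
selected <- (0.5 * x + z + u) > 0           # selection depends on x, z and u

# OLS on the selected sample, with z wrongly allowed into the equation
fit <- lm(y ~ x + z, subset = selected)

sigma2_hat <- summary(fit)$sigma^2                          # estimated error variance
z_pvalue   <- summary(fit)$coefficients["z", "Pr(>|t|)"]    # p-value on z
print(c(sigma2_hat = sigma2_hat, z_pvalue = z_pvalue))
```

In this setup the estimated error variance should fall below the population value of 1, and `z` should look strongly significant even though its true coefficient in the outcome equation is zero. It picks up the part of the error that selection has made predictable.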

Code in R

library(ivreg)

Author's bio

He Huang (https://huanghe.me) is a PhD student at the University of Texas at Dallas. He is currently enrolled in the IMS PhD Program.

Reuse

Text and figures are licensed under Creative Commons Attribution CC BY-NC-SA 4.0. However, some parts may be subject to the publishers' policies.