

In the statistical analysis of observational data, propensity score matching (PSM) is a statistical matching technique that attempts to estimate the effect of a treatment, policy, or other intervention by accounting for the covariates that predict receiving the treatment. PSM attempts to reduce the bias due to confounding variables that could be found in an estimate of the treatment effect obtained from simply comparing outcomes among units that received the treatment with those that did not. The technique was first published by Paul Rosenbaum and Donald Rubin in 1983,[1] and implements the Rubin causal model for observational studies.

The possibility of bias arises because the apparent difference between these two groups of units may depend on characteristics that affected whether or not a unit received a given treatment, rather than on the effect of the treatment per se. In randomized experiments, randomization enables unbiased estimation of treatment effects; for each covariate, randomization implies that treatment groups will be balanced on average, by the law of large numbers. In observational studies, however, the assignment of treatments to research subjects is, by definition, not randomized. Matching attempts to mimic randomization by creating a sample of units that received the treatment that is comparable on all observed covariates to a sample of units that did not receive the treatment.

For example, one may be interested in the consequences of smoking, or of going to university. The people "treated" are simply those (the smokers, or the university graduates) who undergo, in the course of everyday life, whatever it is that the researcher is studying. In both of these cases it is infeasible (and perhaps unethical) to randomly assign people to smoking or to a university education, so observational studies are required. The treatment effect estimated by simply comparing a particular outcome (the rate of cancer, or lifetime earnings) between those who did and did not smoke, or did and did not attend university, would be biased by any factors that predict smoking or university attendance, respectively. PSM attempts to control for these differences to make the treated and untreated groups more comparable.

## Overview

PSM is for cases of causal inference and simple selection bias in non-experimental settings in which: (i) few units in the non-experimental comparison group are comparable to the treatment units; and (ii) selecting a subset of comparison units similar to the treatment unit is difficult because units must be compared across a high-dimensional set of pretreatment characteristics.

In standard matching, we match on single characteristics that distinguish the treatment and control groups (to try to make them more alike). But if the two groups do not have substantial overlap, then substantial error may be introduced: for example, if only the worst cases from the untreated "comparison" group are compared to only the best cases from the treatment group, the result may be regression toward the mean, which can make the comparison group look better or worse than it really is.

PSM employs a predicted probability of group membership (e.g., treatment versus control group), based on observed predictors and usually obtained from logistic regression, to create a counterfactual group. Propensity scores may also be used for matching or as covariates, alone or together with other matching variables or covariates.

## General procedure

1. Run a logistic regression:

• Dependent variable: Y = 1 if the unit participates; Y = 0 otherwise.
• Choose appropriate conditioning (instrumental) variables.
• Obtain the propensity score: the predicted probability (p) or log[p/(1 − p)].

2. Match each participant to one or more nonparticipants on the propensity score.

3. Perform multivariate analysis based on the new sample:

• Use analyses appropriate for non-independent matched samples.
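The three steps above can be sketched in code. The following is a minimal illustration on simulated data, assuming scikit-learn and NumPy are available; the variable names, the simulated data, and the one-to-one nearest-neighbor matching rule are illustrative choices, not prescribed by the procedure.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(42)
n = 500
x = rng.normal(size=(n, 2))                       # observed covariates
p_true = 1 / (1 + np.exp(-(x[:, 0] + x[:, 1])))   # true assignment probability
t = rng.binomial(1, p_true)                       # treatment indicator (Y = 1 if participate)
y = 2.0 * t + x[:, 0] + rng.normal(size=n)        # outcome; true effect is 2 by construction

# Step 1: logistic regression of participation on covariates,
# yielding the propensity score p(x) = Pr(T = 1 | X = x).
ps = LogisticRegression().fit(x, t).predict_proba(x)[:, 1]

# Step 2: match each participant to the nearest nonparticipant on the score.
treated_idx = np.where(t == 1)[0]
control_idx = np.where(t == 0)[0]
matches = control_idx[
    np.abs(ps[treated_idx, None] - ps[None, control_idx]).argmin(axis=1)
]

# Step 3: analyze the matched sample (here, a simple paired difference,
# estimating the average treatment effect on the treated).
att = (y[treated_idx] - y[matches]).mean()
print(f"Estimated ATT: {att:.2f}")  # expect something near the true effect of 2
```

Here matching is one-to-one with replacement on the raw propensity score; calipers, matching on the logit of the score, or one-to-many matching are common variations.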

## Formal definition

A propensity score is the probability of a unit (e.g., person, classroom, school) being assigned to a particular treatment given a set of observed covariates. Propensity scores are used to reduce selection bias by equating groups based on these covariates.

Suppose that we have a binary treatment T, an outcome Y, and background variables X. The propensity score is defined as the conditional probability of treatment given background variables:

$p(x) \ \stackrel{\mathrm{def}}{=}\ \Pr(T=1 | X=x).$

Let Y(0) and Y(1) denote the potential outcomes under control and treatment, respectively. Then treatment assignment is (conditionally) unconfounded if treatment is independent of potential outcomes conditional on X. This can be written compactly as

$T \perp Y(0), Y(1) \,|\, X$

where $\perp$ denotes statistical independence.

If unconfoundedness holds, then

$T \perp Y(0), Y(1) \,|\, p(X).$
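This balancing property of the propensity score can be checked numerically. The sketch below is an illustrative simulation (assuming NumPy and scikit-learn): within strata of units with similar estimated p(x), the covariate distributions of treated and control units should be approximately equal, even though the raw groups are badly imbalanced.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(7)
n = 20000
x = rng.normal(size=n)                      # single observed covariate
t = rng.binomial(1, 1 / (1 + np.exp(-x)))   # treatment assignment depends on x
ps = LogisticRegression().fit(x[:, None], t).predict_proba(x[:, None])[:, 1]

# Raw imbalance: treated units have systematically larger x.
raw_gap = x[t == 1].mean() - x[t == 0].mean()

# Stratify on quintiles of the estimated score and compare within strata.
strata = np.digitize(ps, np.quantile(ps, [0.2, 0.4, 0.6, 0.8]))
gaps = [
    x[(strata == s) & (t == 1)].mean() - x[(strata == s) & (t == 0)].mean()
    for s in range(5)
]
# The within-stratum gaps should be much smaller than the raw gap.
print(raw_gap, max(abs(g) for g in gaps))
```

Stratification on the score is only one way to exploit the balancing property; matching and weighting on p(x) rely on the same result.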

Pearl (2000) has shown that a simple graphical criterion, the back-door criterion, provides an equivalent definition of unconfoundedness.[2]

PSM, like any matching procedure, enables estimation of an average treatment effect from observational data. The key advantage of PSM, at the time of its introduction, was that by combining the covariates into a single score it allowed researchers to balance treatment and control groups on a large number of covariates without losing many observations. If units in the treatment and control groups were instead balanced on a large number of covariates one at a time, large numbers of observations would be needed to overcome the "dimensionality problem," whereby each additional balancing covariate increases the minimum necessary number of observations in the sample geometrically.

PSM has many disadvantages. Among the most critical is that PSM can only account for observed (and observable) covariates: factors that affect assignment to treatment but cannot be observed cannot be accounted for in the matching procedure. Shadish, Cook, and Campbell (2002) additionally argue that PSM requires large samples, that overlap between treatment and control groups must be substantial, and that hidden bias may remain after matching because the procedure only controls for observed variables (and only to the extent that they are perfectly measured).[3]

General concerns with matching have also been raised by Judea Pearl, who has argued that hidden bias may actually increase because matching on observed variables may unleash bias due to dormant unobserved confounders. Similarly, Pearl has argued that bias reduction can only be assured (asymptotically) by modeling the qualitative causal relationships between treatment, outcome, observed and unobserved covariates.[4][5] Confounding occurs when the experimental controls do not allow the experimenter to reasonably eliminate plausible alternative explanations for an observed relationship between independent and dependent variables.

## References

1. ^ Rosenbaum, Paul R.; Rubin, Donald B. (1983). "The central role of the propensity score in observational studies for causal effects". Biometrika 70 (1): 41–55. doi:10.1093/biomet/70.1.41.
2. ^ Pearl, J. (2000). Causality: Models, Reasoning, and Inference. Cambridge University Press.
3. ^ Shadish, W. R.; Cook, T. D.; Campbell, D. T. (2002). Experimental and Quasi-Experimental Designs for Generalized Causal Inference. Boston, MA: Houghton Mifflin.
4. ^ Pearl, J. (2009). "Understanding propensity scores". In Causality: Models, Reasoning, and Inference (2nd ed.). Cambridge University Press.
5. ^ Pearl, J. (2009). Causality: Models, Reasoning, and Inference (2nd ed.). Cambridge University Press.