Data Science for Business Applications

Class 09 - Randomized Control Trials

Potential Outcomes

Last week we discussed potential outcomes (e.g. \(Y_i(1)\) and \(Y_i(0)\)):
“The outcome that we would have observed under different scenarios”
Potential outcomes are related to your choices/possible conditions:
One for each path (Counterfactuals).
Do not confuse them with the values that your outcome variable can take.
Definition of Causal Effect for individual \(i\): \[ \text{causal effect for an individual} = Y_{i}(1)- Y_{i}(0) \]
Individual effects are not observable, so we define the effect for a population instead (a difference in means): \[ \text{ATE} = E\left[Y_{i}(1)- Y_{i}(0)\right] = E\left[Y_{i}(1)\right] - E\left[Y_{i}(0)\right] \]
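To make the definitions concrete, here is a minimal simulated sketch. It assumes a hypothetical world in which both potential outcomes are visible for every unit (never true with real data) and a constant effect of 5; the names `y0`, `y1`, and `ate` are illustrative only.

```r
# Hypothetical simulation: with real data we never observe both potential outcomes
set.seed(1)
n  <- 1000
y0 <- rnorm(n, mean = 50, sd = 10)   # Y_i(0): outcome without the treatment
y1 <- y0 + 5                         # Y_i(1): assumes a constant effect of 5

individual_effect <- y1 - y0         # Y_i(1) - Y_i(0) for each unit
ate <- mean(individual_effect)       # average treatment effect

# The mean of the differences equals the difference of the means
all.equal(ate, mean(y1) - mean(y0))
```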

Causal effect

For a sample:

\[ \widehat{\text{ATE}} = \text{mean of the treated} - \text{mean of the untreated} \]

Under what assumptions is our estimate causal?
Key assumption: Ignorability means that the potential outcomes \(Y_i(0)\) and \(Y_i(1)\) are independent of the treatment.
In our example this means that the decision to pursue a college degree should not be related to unmeasured factors that could influence income.
In reality, this assumption can be difficult to fully satisfy. There could be unobserved factors, such as intrinsic ability or motivation, that affect both the likelihood of obtaining a college degree and future income, leading to potential confounding.
What can we do to make the ignorability assumption hold?

Randomization

One way to make sure the ignorability assumption holds is to do it by design:

Randomize the assignment of the treatment \(Z\)
i.e. Some units will randomly be chosen to be in the treatment group and others to be in the control group.
What does randomization buy us?
Control for unforeseen factors (confounders)
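Random assignment itself can be sketched in a few lines. The data frame `units` and its 20 rows are hypothetical; the only point is that `sample()` decides who is treated, not the units themselves.

```r
set.seed(123)
units <- data.frame(id = 1:20)        # hypothetical study units

# Randomly pick half of the units for treatment (Z = 1); the rest are control (Z = 0)
treated_ids <- sample(units$id, size = nrow(units) / 2)
units$z <- as.integer(units$id %in% treated_ids)

table(units$z)                        # group sizes
```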

Confounders

A confounder is a variable that affects both the treatment AND the outcome.

Confounders

Let’s identify some confounders

Estimate the effect of insurance vs no insurance on number of accidents \(\rightarrow\) Compare people with insurance vs people without insurance.
Confounder: (Driving Behavior/Risk Aversion) Risk-averse individuals are more likely to purchase insurance and may also drive more cautiously, reducing their number of accidents.
Estimate the effect of gym membership vs no gym membership on physical health \(\rightarrow\) Compare people with gym memberships vs people without gym memberships.
Confounder: (Motivation for Fitness) Individuals who are more motivated to improve their health are more likely to purchase a gym membership and are also more likely to engage in other healthy behaviors, such as maintaining a balanced diet, which improves their physical health.
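The gym example can be mimicked with simulated data. This is only a sketch under made-up assumptions: a true membership effect of 2 points, and an unobserved `motivation` variable that raises both the chance of buying a membership and health itself.

```r
set.seed(42)
n <- 10000
motivation <- rnorm(n)                                  # unobserved confounder

# More motivated people are more likely to buy a membership
gym <- rbinom(n, size = 1, prob = plogis(motivation))

# Assumed true effect of a membership on health: 2 points
health <- 50 + 2 * gym + 3 * motivation + rnorm(n)

# Naive comparison of members vs non-members
naive <- mean(health[gym == 1]) - mean(health[gym == 0])
naive   # overstates the assumed effect of 2, because members are also more motivated
```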

Randomization

Due to randomization, we know that the treatment is not affected by a confounder

We get a “clean” effect of the treatment on the outcome.
This is the causal effect of the treatment.
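A simulated sketch of why this works, under made-up assumptions (a true effect of 2 and an unobserved `motivation` variable that affects the outcome): when a coin flip decides the treatment, the confounder is unrelated to it and the simple difference in means lands close to the true effect.

```r
set.seed(42)
n <- 10000
motivation <- rnorm(n)                     # still affects the outcome

z <- rbinom(n, size = 1, prob = 0.5)       # coin flip, unrelated to motivation
health <- 50 + 2 * z + 3 * motivation + rnorm(n)   # assumed true effect: 2

cor(z, motivation)                         # close to 0 by design

estimate <- mean(health[z == 1]) - mean(health[z == 0])
estimate                                   # close to the assumed effect of 2
```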

Randomized controlled trials (RCTs)

Often called the “gold standard” for establishing causality.
Randomly assign the \(Z\), “treatment”, to participants
Now, any observed relationship between \(Z\) and \(Y\) must be due to \(Z\), since the only reason an individual had a particular value of \(Z\) was the random assignment.

Randomized controlled trials (RCTs)

RCT - Steps

Randomize
Check for balance (we will see what this is about)
Calculate difference in sample means between treatment and control group
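One possible way to sketch the three steps on simulated data, randomizing first since balance can only be checked once the groups exist. The data frame `participants`, the covariate `age`, and the assumed effect of 1.5 are all hypothetical.

```r
set.seed(2024)
n <- 2000
participants <- data.frame(age = round(runif(n, 18, 80)))   # hypothetical covariate

# Randomize: shuffle an equal number of 0s and 1s
participants$z <- sample(rep(c(0, 1), each = n / 2))

# Check for balance: covariate means should be similar in both groups
balance <- tapply(participants$age, participants$z, mean)
balance

# Simulated outcome with an assumed treatment effect of 1.5
participants$y <- 10 + 1.5 * participants$z + 0.1 * participants$age + rnorm(n)

# Difference in sample means between treatment and control
ate_hat <- mean(participants$y[participants$z == 1]) -
  mean(participants$y[participants$z == 0])
ate_hat
```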

Example 1: Clinical Trial for the Moderna COVID-19 vaccine

Randomly assign study participants to get either the vaccine or a placebo:

a treatment group of 14,134 people
a control group of the same size
Results: 11 vaccine recipients got COVID; 235 placebo recipients got COVID

library(mosaic)

# Control and treatment group 

# Difference in proportions
prop.test(outcome ~ treatment, data = data.rct, success = 1)


    2-sample test for equality of proportions with continuity correction

data:  tally(outcome ~ treatment)
X-squared = 215.01, df = 1, p-value < 2.2e-16
alternative hypothesis: two.sided
95 percent confidence interval:
 0.01435140 0.01890174
sample estimates:
      prop 1       prop 2 
0.0174048394 0.0007782652
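Since the construction of `data.rct` is not shown, the following sketch works directly from the counts stated on the slide, using base R's `prop.test()` with counts instead of a formula. Its numbers need not match the printed output above exactly, which depends on the underlying data set.

```r
n_group <- 14134                           # size of each group, as stated above
cases   <- c(placebo = 235, vaccine = 11)  # COVID cases per group, as stated above

# Two-sample test for equality of proportions (continuity correction by default)
test <- prop.test(x = cases, n = c(n_group, n_group))

test$estimate                              # infection rate in each group
risk_diff <- unname(test$estimate[1] - test$estimate[2])
risk_diff                                  # placebo rate minus vaccine rate
test$conf.int                              # 95% CI for the difference
```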

Issues with RCT

Internal validity is the ability of an experiment to establish cause-and-effect of the treatment within the sample studied.
Examples of threats to internal validity:
Failure to randomize.
Failure to follow the treatment protocol/attrition.
Small sample sizes.

Issues with RCT

External validity is the ability of an experimental result to generalize to a larger context or population.
Examples of threats to external validity:
Failure to randomize.
Non-representative samples.
Non-representative protocol/policy.

Blocking

Randomization works “on average” but we only get one opportunity at creating treatment and control groups, and there might be imbalances in “nuisance” variables that could affect the outcome.
For example, what will happen if the treatment group for the Moderna trial happens to get younger people in it than the control group?
We can solve this by blocking or stratifying: randomly assigning to treatment/control within groups.
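A minimal sketch of blocked randomization, assuming a hypothetical sample of 200 people and a single blocking variable `age_group`. The assignment is shuffled separately inside each block, so both age groups end up split evenly between treatment and control.

```r
set.seed(7)
people <- data.frame(
  id = 1:200,
  age_group = rep(c("under 65", "65+"), times = c(150, 50))   # hypothetical blocks
)

# Within each block, shuffle an (almost) equal number of 0s and 1s
people$z <- ave(
  people$id, people$age_group,
  FUN = function(ids) sample(rep(c(0, 1), length.out = length(ids)))
)

table(people$age_group, people$z)   # treatment is balanced within every block
```

With simple (unblocked) randomization the split inside each age group would only be balanced on average.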

Blocking

Figure: unbalanced sample

Blocking

Figure: blocked (stratified) sample

Blocking in vaccine trial

In the Moderna vaccine trial, they identified two possible variables that could impact COVID outcomes:
Age (65+ vs under 65)
Underlying health condition

Experiments using regression

Non-blocked design: use a simple regression \[ \widehat{Y} = \widehat{\beta}_0 + \widehat{\beta}_1 T, \]
where \(T\) is a dummy variable that is \[ T = \begin{cases} 1, & \text{for the treatment group}, \\ 0, & \text{for the control group} \end{cases} \]
\(\widehat{\beta}_1\) is the estimated average treatment effect. If \(Y\) is categorical (e.g. binary), the regression needs to be logistic, and \(\widehat{\beta}_1\) is then on the log-odds scale.
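For a numeric outcome, the slope on the treatment dummy is exactly the difference in group means. A small simulated check, with hypothetical names (`t_dummy` is used rather than `T`, which is shorthand for `TRUE` in R) and an assumed true effect of 2:

```r
set.seed(10)
n <- 500
t_dummy <- rep(c(0, 1), each = n / 2)      # 1 = treatment group, 0 = control group
y <- 3 + 2 * t_dummy + rnorm(n)            # assumed true effect: 2

fit <- lm(y ~ t_dummy)
beta1_hat <- unname(coef(fit)["t_dummy"])

diff_means <- mean(y[t_dummy == 1]) - mean(y[t_dummy == 0])

all.equal(beta1_hat, diff_means)           # the two estimates coincide
```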

Experiments using regression

Blocked design: use a regression that controls for the blocking variable \(B\):

\[ \widehat{Y} = \widehat{\beta}_0 + \widehat{\beta}_1 T + \widehat{\beta}_2 B, \]

where \(B\) is a set of dummy variables (fixed effects), one for each stratum. The strata are the combinations (interactions) of the categories of the blocking variables.
Important: the regression needs to be logistic if \(Y\) is categorical.
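A simulated sketch of what controlling for the block can buy, for a numeric outcome. Everything here is hypothetical: four strata `A` to `D` with different baseline outcomes, an assumed treatment effect of 2, and, for brevity, a simple coin-flip assignment instead of assignment within blocks.

```r
set.seed(11)
n <- 4000
block <- factor(sample(c("A", "B", "C", "D"), n, replace = TRUE))  # hypothetical strata
t_dummy <- rbinom(n, size = 1, prob = 0.5)

# Each stratum has its own baseline level of the outcome
block_effect <- c(A = 0, B = 2, C = 4, D = 6)[as.character(block)]
y <- 1 + 2 * t_dummy + block_effect + rnorm(n)   # assumed true effect: 2

fit_simple <- lm(y ~ t_dummy)            # ignores the blocks
fit_block  <- lm(y ~ t_dummy + block)    # one fixed effect per stratum

se <- c(
  simple  = summary(fit_simple)$coefficients["t_dummy", "Std. Error"],
  blocked = summary(fit_block)$coefficients["t_dummy", "Std. Error"]
)
se   # the blocked model estimates the treatment effect more precisely
```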

Get Out The Vote (GOTV)

Fact: lots of people don’t vote.
It’s important for people to vote, to ensure that our government reflects the will of its constituents.
How do we get people to vote?

Get Out The Vote (GOTV)

In 2002, researchers at Temple and Yale conducted a large phone banking experiment to see whether calling voters helps:
From among 381,062 phone numbers of voters in Iowa and Michigan, they randomly contacted about 12,000 voters.
The outcome Y of interest is whether each voter actually voted.

No blocking

Estimating the average treatment effect with logistic regression:

# Logistic regression of turnout (vote02) on the treatment indicator
glm = glm(vote02 ~ treatment, data = GOTV, family = "binomial")
summary(glm)


Call:
glm(formula = vote02 ~ treatment, family = "binomial", data = GOTV)

Coefficients:
                   Estimate Std. Error z value Pr(>|z|)    
(Intercept)        0.184717   0.003306  55.870   <2e-16 ***
treatmenttreatment 0.170824   0.018843   9.066   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 524839  on 381061  degrees of freedom
Residual deviance: 524756  on 381060  degrees of freedom
AIC: 524760

Number of Fisher Scoring iterations: 3

The coefficients are in log odds.

No blocking

The estimated treatment effect is an increase of approximately 19% in the odds of voting:

(exp(0.17)-1)*100

[1] 18.53049

Receiving a phone call increases the odds of voting by about 19% compared to those who did not receive a call.
Confidence interval for the treatment effect (in log odds):

confint(glm)

                       2.5 %    97.5 %
(Intercept)        0.1782378 0.1911978
treatmenttreatment 0.1339278 0.2077954
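Because the coefficients are log odds, exponentiating gives a change in odds, not in probability. The sketch below copies the printed estimates by hand (an assumption made so it runs without the `GOTV` data) and converts them to predicted probabilities with `plogis()`. On this scale the difference works out to roughly four percentage points.

```r
b0 <- 0.184717    # intercept: log odds of voting in the control group (from the output)
b1 <- 0.170824    # treatment coefficient: log odds ratio (from the output)

odds_ratio <- exp(b1)             # multiplicative change in the odds of voting
(odds_ratio - 1) * 100            # percentage change in the odds

p_control <- plogis(b0)           # predicted probability of voting, no call
p_treated <- plogis(b0 + b1)      # predicted probability of voting, call
prob_diff <- p_treated - p_control
prob_diff                         # difference in probabilities

# 95% confidence interval for the odds ratio, from the printed log-odds interval
exp(c(0.1339278, 0.2077954))
```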

Blocking

The researchers actually used a blocking design with two variables that they thought could impact voting rates (separately from the phone calls):
The state of the voter (Iowa (0) or Michigan (1))
Whether the voter was in a “competitive” district (one where there was likely to be a close election)

Blocking

GOTV = GOTV %>%
       mutate(block = interaction(state, competiv))
glm_vote = glm(vote02 ~ treatment + block, data = GOTV, family = 'binomial')
summary(glm_vote)


Call:
glm(formula = vote02 ~ treatment + block, family = "binomial", 
    data = GOTV)

Coefficients:
                   Estimate Std. Error z value Pr(>|z|)    
(Intercept)        0.043236   0.004146   10.43   <2e-16 ***
treatmenttreatment 0.028542   0.019279    1.48    0.139    
block1.1           0.351686   0.015168   23.19   <2e-16 ***
block0.2           0.196691   0.008866   22.18   <2e-16 ***
block1.2           0.603739   0.009515   63.45   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 524839  on 381061  degrees of freedom
Residual deviance: 520331  on 381057  degrees of freedom
AIC: 520341

Number of Fisher Scoring iterations: 4

confint(glm_vote)

                          2.5 %     97.5 %
(Intercept)         0.035109260 0.05136249
treatmenttreatment -0.009210835 0.06636325
block1.1            0.321979732 0.38143873
block0.2            0.179317682 0.21407167
block1.2            0.585102929 0.62239941
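The treatment interval above is in log odds. A short sketch, with the printed bounds copied by hand, converts it to an odds ratio and checks whether it includes 1 (no effect):

```r
# Bounds for the treatment coefficient, copied from the confint() output above
ci_log_odds <- c(lower = -0.009210835, upper = 0.06636325)

ci_odds_ratio <- exp(ci_log_odds)   # same interval on the odds-ratio scale
ci_odds_ratio

# An odds ratio of 1 means the call makes no difference
includes_one <- unname(ci_odds_ratio["lower"] < 1 && ci_odds_ratio["upper"] > 1)
includes_one
```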

Blocking

Once we control for the blocks, the effect of the treatment is no longer statistically significant (p = 0.139; the 95% confidence interval includes 0).
What if some callers didn’t stick to the script?
Many people didn’t answer the phone!
What about voters outside of the Midwest?

The limitations of RCTs

Although powerful for inferring causation, RCTs are difficult to apply.
They can be incredibly expensive.
Compliance with the treatment protocol isn’t perfect (e.g., mask-wearing, picking up the phone).
It can be hard to generalize beyond the participants involved in the study.
They can be impractical (e.g., effect of education on performance) or unethical to conduct (e.g., seatbelts, parachutes, even medical trials).