Publications

2023

August 2023

Simultaneous adjustment of uncontrolled confounding, selection bias and misclassification in multiple-bias modelling International Journal of Epidemiology

Adjusting for multiple biases usually involves adjusting for one bias at a time, with careful attention to the order in which these biases are adjusted. A novel, alternative approach to multiple-bias adjustment involves the simultaneous adjustment of all biases via imputation and/or regression weighting. The imputed value or weight corresponds to the probability of the missing data and serves to 'reconstruct' the unbiased data that would be observed based on the provided assumptions of the degree of bias.

Paul Brendel, Aracelis Torres, Onyebuchi Arah

August 2023

Monotonicity: Detection, Refutation, and Ramification

The assumption of monotonicity, namely that outputs cannot decrease when inputs increase, is critical for many reasoning tasks, including unit selection, A/B testing, and quasi-experimental econometrics. It is also vital for identifying Probabilities of Causation, which, in turn, enable the estimation of individual-level behavior. This paper demonstrates how monotonicity can be detected (or refuted) using observational, experimental, or combined data. Using such data, we pinpoint regions where monotonicity is definitively violated, where it unequivocally holds, and where its status remains undetermined. We further explore the consequences of monotonicity violations, especially when a maximum percentage of possible violation is specified. Finally, we illustrate applications for personalized decision-making.

Scott Mueller, Judea Pearl

Software: Monotonicity regions, Link: Interactive plot for necessary and sufficient regions of monotonicity

July 2023

From “Is it unconfounded?” to “How much confounding would it take?”: Applying the sensitivity-based approach to assess causes of support for peace in Colombia The Journal of Politics

Attention to the credibility of causal claims has increased tremendously in recent years. When relying on observational data, debate often centers on whether investigators have ruled out any bias due to confounding. However, the relevant scientific question is generally not whether bias is precisely zero, but whether it is problematic enough to alter one’s research conclusion. We argue that sensitivity analyses would improve research practice by showing how results would change under plausible degrees of confounding, or equivalently, by revealing what one must argue about the strength of confounding to sustain a research conclusion. This would improve scrutiny of studies in which non-zero bias is expected, and of those where authors argue for zero bias but results may be fragile to confounding too weak to be ruled out. We illustrate this using off-the-shelf sensitivity tools to examine two potential influences on support for the FARC peace agreement in Colombia.

Chad Hazlett, Francesca Parente

January 2022

Causal Effect of Chronic Pain on Mortality Through Opioid Prescriptions: Application of the Front-Door Formula Epidemiology

Background: Chronic pain is the leading cause of disability worldwide and is strongly associated with the epidemic of opioid overdosing events. However, the causal links between chronic pain, opioid prescriptions, and mortality remain unclear.

Methods: This study included 13,884 US adults aged ≥20 years who provided data on chronic pain in the National Health and Nutrition Examination Survey 1999-2004 with linkage to mortality databases through 2015. We employed the generalized form of the front-door formula within the structural causal model framework to investigate the causal effect of chronic pain on all-cause mortality mediated by opioid prescriptions.

Results: We identified a total of 718 participants at 3 years of follow-up and 1260 participants at 5 years as having died from all causes. Opioid prescriptions increased the risk of all-cause mortality with an estimated odds ratio (OR) (95% confidence interval) = 1.5 (1.1, 1.9) at 3 years and 1.3 (1.1, 1.6) at 5 years. The front-door formula revealed that chronic pain increased the risk of all-cause mortality through opioid prescriptions; OR = 1.06 (1.01, 1.11) at 3 years and 1.03 (1.01, 1.06) at 5 years. Our bias analysis showed that our findings based on the front-door formula were likely robust to plausible sources of bias from uncontrolled exposure-mediator or mediator-outcome confounding.

Conclusions: Chronic pain increased the risk of all-cause mortality through opioid prescriptions. Our findings highlight the importance of careful guideline-based chronic pain management to prevent death from possibly inappropriate opioid prescriptions driven by chronic pain.

Kosuke Inoue, Beate Ritz, Onyebuchi Arah
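
As a rough illustration of the front-door formula named in the abstract above, the sketch below applies the basic, covariate-free version, P(y | do(x)) = Σ_m P(m | x) Σ_x' P(y | x', m) P(x'), to a discrete joint distribution. It is not the generalized form used in the study. The function name `front_door` and every probability in the toy model are hypothetical, chosen only so that the estimate can be compared with a known truth.

```python
import numpy as np

def front_door(joint):
    """Basic front-door adjustment for discrete X, M, Y.

    joint[x, m, y] = P(X=x, M=m, Y=y). Returns do[x, y] = P(Y=y | do(X=x)).
    Assumes every (x, m) cell has positive probability.
    """
    joint = np.asarray(joint, dtype=float)
    p_x = joint.sum(axis=(1, 2))                 # P(x)
    p_xm = joint.sum(axis=2)                     # P(x, m)
    p_m_given_x = p_xm / p_x[:, None]            # P(m | x)
    p_y_given_xm = joint / p_xm[:, :, None]      # P(y | x, m)
    # inner[m, y] = sum over x' of P(y | x', m) * P(x')
    inner = np.einsum("xmy,x->my", p_y_given_xm, p_x)
    # do[x, y] = sum over m of P(m | x) * inner[m, y]
    return p_m_given_x @ inner

# Hypothetical toy model: U -> X -> M -> Y and U -> Y, with U unobserved.
p_u = np.array([0.6, 0.4])                       # P(U=u)
p_x1_u = np.array([0.2, 0.7])                    # P(X=1 | U=u)
p_m1_x = np.array([0.1, 0.8])                    # P(M=1 | X=x)
p_y1_mu = np.array([[0.1, 0.3], [0.5, 0.9]])     # P(Y=1 | M=m, U=u)

def bern(p1, value):
    """Probability of a binary value given P(value = 1) = p1."""
    return p1 if value == 1 else 1.0 - p1

# Observed joint P(x, m, y), obtained by summing the unobserved U out.
joint = np.zeros((2, 2, 2))
for u in range(2):
    for x in range(2):
        for m in range(2):
            for y in range(2):
                joint[x, m, y] += (p_u[u] * bern(p_x1_u[u], x)
                                   * bern(p_m1_x[x], m)
                                   * bern(p_y1_mu[m, u], y))

# Front-door estimate of P(Y=1 | do(X=x)), using observables only.
estimated = front_door(joint)[:, 1]

# Ground truth from the full model: sum_m P(m | x) sum_u P(Y=1 | m, u) P(u).
p_y1_do_m = p_y1_mu @ p_u
truth = np.array([(1 - p_m1_x[x]) * p_y1_do_m[0] + p_m1_x[x] * p_y1_do_m[1]
                  for x in range(2)])
print(estimated, truth)
```

In this toy model the mediator fully carries the effect of X on Y and is itself unconfounded given X, which is what lets the formula recover the interventional risk without observing U.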

December 2020

Understanding, choosing, and unifying multilevel and fixed effect approaches. Political Analysis

When working with grouped data, investigators may choose between “fixed effects” models (FE) with specialized (e.g., cluster-robust) standard errors, or “multilevel models” (MLMs) employing “random effects”. We review the claims given in published works regarding this choice, then clarify how these approaches work and compare by showing that: (i) random effects employed in MLMs are simply “regularized” fixed effects; (ii) unmodified MLMs are consequently susceptible to bias, but there is a longstanding remedy; and (iii) the “default” MLM standard errors rely on narrow assumptions that can lead to undercoverage in many settings. Our review of over 100 papers using MLM in political science, education, and sociology shows that these “known” concerns have been widely ignored in practice. We describe how to debias MLM’s coefficient estimates, and provide an option to more flexibly estimate their standard errors. Most illuminating, once MLMs are adjusted in these two ways the point estimate and standard error for the target coefficient are exactly equal to those of the analogous FE model with cluster-robust standard errors. For investigators working with observational data and who are interested only in inference on the target coefficient, either approach is equally appropriate and preferable to uncorrected MLM.

Chad Hazlett, Leonard Wainstein

July 2020

Wildfire Exposure Increases Pro-Climate Political Behaviors American Political Science Review

One political barrier to climate reforms is the temporal mismatch between short-term policy costs and long-term policy benefits. Will public support for climate reforms increase as climate-related disasters make the short-term costs of inaction more salient? Leveraging variation in the timing of Californian wildfires, we evaluate how exposure to a climate-related hazard influences political behavior, rather than self-reported attitudes or behavioral intentions. We show that wildfires increased support for costly, climate-related ballot measures by 5 to 6 percentage points for those living within 5km of a recent wildfire, decaying to near zero beyond a distance of 15km. This effect is concentrated in Democratic-voting areas, and nearly zero in Republican-dominated areas. We conclude that experienced climate threats can enhance willingness-to-act but largely in places where voters are known to believe in climate change.

Chad Hazlett, Matto Mildenberger

July 2020

Inference without randomization or ignorability: A stability-controlled quasi-experiment on the prevention of tuberculosis. Statistics in Medicine

The stability-controlled quasi-experiment (SCQE) is an approach to study the effects of nonrandomized, newly adopted treatments. While covariate adjustment techniques rely on a “no unobserved confounding” assumption, SCQE imposes an assumption on the change in the average nontreatment outcome between successive cohorts (the “baseline trend”). We provide inferential tools for SCQE and its first application, examining whether isoniazid preventive therapy (IPT) reduced tuberculosis (TB) incidence among 26,715 HIV patients in Tanzania. After IPT became available, 16% of untreated patients developed TB within a year, compared with only 0.5% of patients under treatment. Thus, a simple difference in means suggests a 15.5 percentage point (pp) lower risk (p ≪ .001). Adjusting for covariates using numerous techniques leaves this effectively unchanged. Yet, due to confounding biases, such estimates can be misleading regardless of their statistical strength. By contrast, SCQE reveals valid causal effect estimates for any chosen assumption on the baseline trend. For example, assuming a baseline trend near 0 (no change in TB incidence over time, absent this treatment) implies a small and insignificant effect. To argue IPT was beneficial requires arguing that the nontreatment incidence would have risen by at least 0.7 pp per year, which is plausible but far from certain. SCQE may produce narrow estimates when the plausible range of baseline trends can be sufficiently constrained, while in every case it tells us what baseline trends must be believed in order to sustain a given conclusion, protecting against inferences that rely upon infeasible assumptions.

Chad Hazlett, Werner Maokola, David Ami Wulf

January 2020

Analyzing Selection Bias for Credible Causal Inference: When in Doubt, DAG It Out. Epidemiology

Causal modeling and inference rely on strong assumptions, one of which is conditional exchangeability. Uncontrolled confounding is often seen as the most important threat to conditional exchangeability, although collider-stratification bias or selection bias can be just as important [1–4]. In this issue of the journal, Flanders and Ye [5] (henceforth, F&Y) and Smith and VanderWeele [6] (henceforth, S&VW) present their results on new bounds (limits that selection bias would not exceed in any specified context) and accompanying summary measures for the values of the selection bias bounding factors that will be enough to explain away any observed association between the exposure and the outcome on the risk ratio or relative risk scale, with risk difference results given in the appendix of S&VW’s article. These articles on M-bias or selection bias fit into a growing body of work that has renewed researchers’ interest in selection bias, including the recent overlapping literature on generalizability and transportability, and bounding factors and related summary measures for bias analysis.

Onyebuchi Arah

October 2019

The effect of personal violence on attitudes towards peace in Darfur Journal of Conflict Resolution

Does exposure to violence motivate individuals to support further violence, or to seek peace? Such questions are central to our understanding of how conflicts evolve, terminate, and recur. Yet, convincing empirical evidence as to which response dominates, even in a specific case, has been elusive, owing to the inability to rule out confounding biases. This paper employs a natural experiment based on the indiscriminacy of violence within villages in Darfur to examine how refugees' experiences of violence affect their attitudes toward peace. The results are consistent with a pro-peace or "weary" response: individuals directly harmed by violence were more likely to report that peace is possible, and less likely to demand execution of their enemies. This provides micro-level evidence supporting earlier country-level work on "war-weariness," and extends the growing literature on the effects of violence on individuals by including attitudes toward peace as an important outcome. These findings suggest that victims harmed by violence during war can play a positive role in settlement and reconciliation processes.

Chad Hazlett

January 2019

A Persuasive Peace: Syrian refugees' attitudes towards compromise and civil war termination Journal of Peace Research

Civilians who have fled violent conflict and settled in neighboring countries are integral to processes of civil war termination. Contingent on their attitudes, they can either back peaceful settlements or support warring groups and continued fighting. Attitudes toward peaceful settlement are expected to be especially obdurate for civilians who have been exposed to violence. In a survey of 1,120 Syrian refugees in Turkey conducted in 2016, we use experiments to examine attitudes towards two critical phases of conflict termination: a ceasefire and a peace agreement. We examine the rigidity/flexibility of refugees' attitudes to see if subtle changes in how wartime losses are framed or in who endorses a peace process can shift willingness to compromise with the incumbent Assad regime. Our results show, first, that refugees are far more likely to agree to a ceasefire proposed by a civilian as opposed to one proposed by armed actors from either the Syrian government or the opposition. Second, simply describing the refugee community's wartime experience as suffering rather than sacrifice substantially increases willingness to compromise with the regime to bring about peace. This effect remains strong among those who experienced greater violence. Together, these results show that even among a highly pro-opposition population that has experienced severe violence, willingness to settle and make peace is remarkably flexible and dependent upon these cues.

Kristin Fabbe, Chad Hazlett, Tolga Sinmazdemir

September 2018

Estimating causal effects of new treatments despite self-selection: The case of experimental medical treatments Journal of Causal Inference

Providing terminally ill patients with access to experimental treatments, as allowed by recent “right to try” laws and “expanded access” programs, poses a variety of ethical questions. While practitioners and investigators may assume it is impossible to learn the effects of these treatments without randomized trials, this paper describes a simple tool to estimate the effects of these experimental treatments on those who take them, despite the problem of selection into treatment, and without assumptions about the selection process. The key assumption is that the average outcome, such as survival, would remain stable over time in the absence of the new treatment. Such an assumption is unprovable, but can often be credibly judged by reference to historical data and by experts familiar with the disease and its treatment. Further, where this assumption may be violated, the result can be adjusted to account for a hypothesized change in the non-treatment outcome, or to conduct a sensitivity analysis. The method is simple to understand and implement, requiring just four numbers to form a point estimate. Such an approach can be used not only to learn which experimental treatments are promising, but also to warn us when treatments are actually harmful – especially when they might otherwise appear to be beneficial, as illustrated by example here. While this note focuses on experimental medical treatments as a motivating case, more generally this approach can be employed where a new treatment becomes available or has a large increase in uptake, where selection bias is a concern, and where an assumption on the change in average non-treatment outcome over time can credibly be imposed.

Chad Hazlett

March 2018

Covariate Balancing Propensity Score for a Continuous Treatment: Application to the efficacy of political advertisements Annals of Applied Statistics

Propensity score matching and weighting are popular methods when estimating causal effects in observational studies. Beyond the assumption of unconfoundedness, however, these methods also require the model for the propensity score to be correctly specified. The recently proposed covariate balancing propensity score (CBPS) methodology increases the robustness to model misspecification by directly optimizing sample covariate balance between the treatment and control groups. In this paper, we extend the CBPS to a continuous treatment. We propose the covariate balancing generalized propensity score (CBGPS) methodology, which minimizes the association between covariates and the treatment. We develop both parametric and nonparametric approaches and show their superior performance over the standard maximum likelihood estimation in a simulation study. The CBGPS methodology is applied to an observational study, whose goal is to estimate the causal effects of political advertisements on campaign contributions. We also provide open-source software that implements the proposed methods.

For R users, CBPS can be installed from CRAN: install.packages("CBPS")

Christian Fong, Chad Hazlett, Kosuke Imai

Software: R package

January 2018

Making Sense of Sensitivity: Extending Omitted Variable Bias Journal of the Royal Statistical Society, Series B (Statistical Methodology)

In this paper we extend the familiar "omitted variable bias" framework, creating a suite of tools for sensitivity analysis of regression coefficients and their standard errors to unobserved confounders that: (i) do not require assumptions about the functional form of the treatment assignment mechanism nor the distribution of the unobserved confounder(s); (ii) can be used to assess the sensitivity to multiple confounders, whether they influence the treatment or the outcome linearly or not; (iii) facilitate the use of expert knowledge to judge the plausibility of sensitivity parameters; and, (iv) can be easily and intuitively displayed, either in concise regression tables or more elaborate graphs. More precisely, we introduce two novel measures for communicating the sensitivity of regression results that can be used for routine reporting. The "robustness value" describes the association unobserved confounding would need to have with both the treatment and the outcome to change the research conclusions. The partial R-squared of the treatment with the outcome shows how strongly confounders explaining all of the outcome would have to be associated with the treatment to eliminate the estimated effect. Next, we provide intuitive graphical tools that allow researchers to make more elaborate arguments about the sensitivity of not only point estimates but also t-values (or p-values and confidence intervals). We also provide graphical tools for exploring extreme sensitivity scenarios in which all or much of the residual variance is assumed to be due to confounders. Finally, we note that a widespread informal "benchmarking" practice can be widely misleading, and introduce a novel alternative that allows researchers to formally bound the strength of unobserved confounders "as strong as" certain covariate(s) in terms of the explained variance of the treatment and/or the outcome. We illustrate these methods with a running example that estimates the effect of exposure to violence in western Sudan on attitudes toward peace.

Carlos Cinelli, Chad Hazlett

Software: R sensemakr, STATA sensemakr, Python sensemakr, Shinyapp
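
For readers who want a feel for the two reporting measures described above, the following is a minimal sketch that computes them from a regression t-statistic and its residual degrees of freedom. It uses the closed-form expressions as commonly stated for this framework: partial R² = t² / (t² + df), and RV_q = ½(√(f_q⁴ + 4f_q²) − f_q²) with f_q = q·|t|/√df. The function names and input numbers are hypothetical, and the sketch does not reproduce the sensemakr packages, which should be preferred for real analyses.

```python
import math

def partial_r2(t_stat, dof):
    """Partial R-squared of the treatment with the outcome,
    computed from the treatment's t-statistic and residual degrees of freedom."""
    return t_stat ** 2 / (t_stat ** 2 + dof)

def robustness_value(t_stat, dof, q=1.0):
    """Robustness value for reducing the point estimate by a fraction q.

    q = 1 corresponds to bringing the estimate all the way to zero.
    """
    f_q = q * abs(t_stat) / math.sqrt(dof)   # scaled partial Cohen's f
    return 0.5 * (math.sqrt(f_q ** 4 + 4 * f_q ** 2) - f_q ** 2)

# Hypothetical regression output, for illustration only.
t_example, dof_example = 2.0, 100
r2_example = partial_r2(t_example, dof_example)
rv_example = robustness_value(t_example, dof_example)
print(f"partial R2 = {r2_example:.4f}, RV = {rv_example:.4f}")
```

Read informally, a robustness value of about 0.18 means that confounders would need to explain roughly 18% of the residual variance of both the treatment and the outcome to bring the estimate to zero.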

March 2017

Bias Analysis for Uncontrolled Confounding in the Health Sciences Annual Review of Public Health

Uncontrolled confounding due to unmeasured confounders biases causal inference in health science studies using observational and imperfect experimental designs. The adoption of methods for analysis of bias due to uncontrolled confounding has been slow, despite the increasing availability of such methods. Bias analysis for such uncontrolled confounding is most useful in big data studies and systematic reviews to gauge the extent to which extraneous preexposure variables that affect the exposure and the outcome can explain some or all of the reported exposure-outcome associations. We review methods that can be applied during or after data analysis to adjust for uncontrolled confounding for different outcomes, confounders, and study settings. We discuss relevant bias formulas and how to obtain the required information for applying them. Finally, we develop a new intuitive generalized bias analysis framework for simulating and adjusting for the amount of uncontrolled confounding due to not measuring and adjusting for one or more confounders.

Onyebuchi Arah

October 2015

G-computation demonstration in causal mediation analysis European Journal of Epidemiology

Recent work has considerably advanced the definition, identification and estimation of controlled direct, and natural direct and indirect effects in causal mediation analysis. Despite the various estimation methods and statistical routines being developed, a unified approach for effect estimation under different effect decomposition scenarios is still needed for epidemiologic research. G-computation offers such unification and has been used for total effect and joint controlled direct effect estimation settings, involving different types of exposure and outcome variables. In this study, we demonstrate the utility of parametric g-computation in estimating various components of the total effect, including (1) natural direct and indirect effects, (2) standard and stochastic controlled direct effects, and (3) reference and mediated interaction effects, using Monte Carlo simulations in standard statistical software. For each study subject, we estimated their nested potential outcomes corresponding to the (mediated) effects of an intervention on the exposure wherein the mediator was allowed to attain the value it would have under a possible counterfactual exposure intervention, under a pre-specified distribution of the mediator independent of any causes, or under a fixed controlled value. A final regression of the potential outcome on the exposure intervention variable was used to compute point estimates and bootstrap was used to obtain confidence intervals. Through contrasting different potential outcomes, this analytical framework provides an intuitive way of estimating effects under the recently introduced 3- and 4-way effect decomposition. This framework can be extended to complex multivariable and longitudinal mediation settings.

Aolin Wang, Onyebuchi Arah

October 2015

Kernel Balancing: A flexible non-parametric weighting procedure for estimating causal effects Statistica Sinica

Matching and weighting methods are widely used to estimate causal effects when adjusting for a set of observables is required. Matching is appealing for its non-parametric nature, but with continuous variables, is not guaranteed to remove bias. Weighting techniques choose weights on units to ensure pre-specified functions of the covariates have equal (weighted) means for the treated and control group. This assures unbiased effect estimation only when the potential outcomes are linear in those pre-specified functions of the observables. Kernel balancing begins by assuming the expectation of the non-treatment potential outcome conditional on the covariates falls in a large, flexible space of functions associated with a kernel. It then constructs linear bases for this function space and achieves approximate balance on these bases. A worst-case bound on the bias due to this approximation is given and is the target of minimization. Relative to current practice, kernel balancing offers one reasoned solution to the long-standing question of which functions of the covariates investigators should attempt to achieve (and check) balance on. Further, these weights are also those that would make the estimated multivariate density of covariates approximately the same for the treated and control groups, when the same choice of kernel is used to estimate those densities. The approach is fully automated up to the choice of a kernel and smoothing parameter, for which default options and guidelines are provided. An R package, KBAL, implements this approach.

Chad Hazlett

Software: R package, Link: Supplement

January 2011

Bias formulas for sensitivity analysis of unmeasured confounding for general outcomes, treatments, and confounders Epidemiology

Uncontrolled confounding in observational studies gives rise to biased effect estimates. Sensitivity analysis techniques can be useful in assessing the magnitude of these biases. In this paper, we use the potential outcomes framework to derive a general class of sensitivity-analysis formulas for outcomes, treatments, and measured and unmeasured confounding variables that may be categorical or continuous. We give results for additive, risk-ratio and odds-ratio scales. We show that these results encompass a number of more specific sensitivity-analysis methods in the statistics and epidemiology literature. The applicability, usefulness, and limits of the bias-adjustment formulas are discussed. We illustrate the sensitivity-analysis techniques that follow from our results by applying them to 3 different studies. The bias formulas are particularly simple and easy to use in settings in which the unmeasured confounding variable is binary with constant effect on the outcome across treatment levels.

Tyler J. VanderWeele, Onyebuchi Arah
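
The simple case mentioned at the end of the abstract above can be sketched in a few lines. Assuming a binary unmeasured confounder U whose effect on the outcome is the same at every treatment level, the additive-scale bias is taken here to be the product of that effect (gamma) and the difference in the prevalence of U between treated and untreated (delta). The function names and numbers are hypothetical and only illustrate the arithmetic; the paper's general formulas cover far more cases.

```python
def additive_bias_binary_u(gamma, prev_treated, prev_untreated):
    """Bias on the difference scale from a binary unmeasured confounder U.

    gamma          : effect of U on the outcome (difference scale), assumed
                     constant across treatment levels
    prev_treated   : P(U = 1 | treated)
    prev_untreated : P(U = 1 | untreated)
    """
    delta = prev_treated - prev_untreated
    return gamma * delta

def bias_adjusted_difference(observed, gamma, prev_treated, prev_untreated):
    """Observed effect estimate minus the hypothesized confounding bias."""
    return observed - additive_bias_binary_u(gamma, prev_treated, prev_untreated)

# Hypothetical sensitivity parameters, for illustration only.
bias_example = additive_bias_binary_u(gamma=0.2, prev_treated=0.5, prev_untreated=0.3)
adjusted_example = bias_adjusted_difference(0.10, 0.2, 0.5, 0.3)
print(bias_example, adjusted_example)
```

In practice the two sensitivity parameters would be varied over a plausible range rather than fixed at single values.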

August 2008

Bias formulas for external adjustment and sensitivity analysis of unmeasured confounders Annals of Epidemiology

Purpose: Uncontrolled confounders are an important source of bias in epidemiologic studies. The authors review and derive a set of parallel simple formulas for bias factors in the risk difference, risk ratio, and odds ratio from studies with an unmeasured polytomous confounder and a dichotomous exposure and outcome.

Methods: The authors show how the bias formulas are related to and are sometimes simpler than earlier formulas. The article contains three examples, including a Monte Carlo sensitivity analysis of a preadjusted or conditional estimate.

Results: All the bias expressions can be given parallel formulations as the difference or ratio of (i) the sum across confounder strata of each exposure-stratified confounder-outcome effect measure multiplied by the confounder prevalences among the exposed and (ii) the sum across confounder strata of the same effect measure multiplied by the confounder prevalences among the unexposed. The basic formulas can be applied to scenarios with a polytomous confounder, exposure, or outcome.

Conclusions: In addition to aiding design and analysis strategies for confounder control, the bias formulas provide a link between classical standardization decompositions of demography and classical bias formulas of epidemiology. They are also useful in constructing general programs for sensitivity analysis and more elaborate probabilistic risk analyses.

Onyebuchi Arah, Yasutaka Chiba, Sander Greenland
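
The parallel structure described in the Results above can be illustrated for the risk ratio. The sketch below assumes that the confounder-outcome risk ratios are the same among the exposed and the unexposed, so one set of ratios (reference level equal to 1) is weighted by the confounder prevalences in each exposure group and the two sums are divided. The function names and all input values are hypothetical, and the simplification is a reading of the abstract rather than the article's full set of formulas.

```python
def rr_bias_factor(rr_confounder, prev_exposed, prev_unexposed):
    """Risk-ratio bias factor for an unmeasured polytomous confounder.

    rr_confounder  : confounder-outcome risk ratio per confounder level
                     (reference level = 1.0), assumed equal across exposure groups
    prev_exposed   : P(U = k | exposed) for each level k
    prev_unexposed : P(U = k | unexposed) for each level k
    """
    for prevalences in (prev_exposed, prev_unexposed):
        if abs(sum(prevalences) - 1.0) > 1e-9:
            raise ValueError("confounder prevalences must sum to 1")
    numerator = sum(rr * p for rr, p in zip(rr_confounder, prev_exposed))
    denominator = sum(rr * p for rr, p in zip(rr_confounder, prev_unexposed))
    return numerator / denominator

def externally_adjusted_rr(observed_rr, rr_confounder, prev_exposed, prev_unexposed):
    """Observed risk ratio divided by the hypothesized bias factor."""
    return observed_rr / rr_bias_factor(rr_confounder, prev_exposed, prev_unexposed)

# Hypothetical three-level confounder, for illustration only.
rr_u = [1.0, 2.0, 3.0]
p_exposed = [0.2, 0.3, 0.5]
p_unexposed = [0.5, 0.3, 0.2]
bias_factor = rr_bias_factor(rr_u, p_exposed, p_unexposed)
adjusted_rr = externally_adjusted_rr(2.0, rr_u, p_exposed, p_unexposed)
print(round(bias_factor, 4), round(adjusted_rr, 4))
```

When the confounder distribution is identical in both exposure groups the factor equals 1 and the observed risk ratio is left unchanged.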