The key idea that underpins this book is that raw parameter estimates obtained by fitting a model can often be transformed into more interpretable quantities. Presenting results in a way that resonates with the audience enhances clarity, communication, and impact.
Unfortunately, computing intuitive statistical quantities, along with their standard errors, can be a tedious and error-prone process. Furthermore, whereas many excellent software packages exist to fit statistical and machine learning models, these packages often behave in idiosyncratic ways. They often produce objects with incompatible structures, content, and behavior, which makes it difficult for analysts to maintain a consistent workflow across projects.
marginaleffects for R and Python
To address this challenge, this book introduces the free and open source software package marginaleffects, which provides a single point of entry for interpreting results from over 100 different model classes in R and Python. This package simplifies the interpretation process by offering a consistent and powerful user interface, reducing the need for customized code, and minimizing the risk of errors.
Table 2.1 lists the main functions of the marginaleffects package. These functions allow analysts to compute a wide range of quantities, grouped into three categories: predictions(), comparisons(), and slopes().
- predictions: This family of functions computes and plots predictions on different scales, at different levels of aggregation (Chapter 5).
- comparisons: This family of functions computes and plots counterfactual comparisons, which can characterize the relationships between two or more variables (Chapter 6). This broad class of estimands includes contrasts, differences, risk ratios, odds ratios, lift, and even user-defined functions.
- slopes: This family of functions computes and plots partial derivatives of the outcome equation, commonly called “marginal effects” in econometrics or “trends” in other disciplines.
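To make the three categories concrete, here is a minimal hand-rolled sketch in plain Python. It does not use marginaleffects and is not how the package is implemented. The logistic model, its coefficient values, and the helper names `predict`, `comparison`, and `slope` are all hypothetical, chosen only to show what each kind of quantity measures.

```python
import math

# Hypothetical coefficients of a fitted logistic regression (assumed values):
# P(y = 1) = logistic(intercept + b_x * x + b_treated * treated)
COEFS = {"intercept": -1.0, "x": 0.5, "treated": 0.8}


def predict(x, treated, scale="response"):
    """Prediction on the link (log-odds) or response (probability) scale."""
    eta = COEFS["intercept"] + COEFS["x"] * x + COEFS["treated"] * treated
    if scale == "link":
        return eta
    return 1.0 / (1.0 + math.exp(-eta))


def comparison(x, how="difference"):
    """Counterfactual comparison: the same unit with treated=1 versus treated=0."""
    treated, control = predict(x, 1), predict(x, 0)
    if how == "difference":
        return treated - control
    if how == "ratio":
        return treated / control
    raise ValueError(f"unknown comparison type: {how}")


def slope(x, treated, eps=1e-6):
    """Partial derivative of the predicted probability with respect to x,
    approximated by a central finite difference."""
    return (predict(x + eps, treated) - predict(x - eps, treated)) / (2 * eps)


if __name__ == "__main__":
    print("prediction:", predict(2.0, 0))
    print("risk difference:", comparison(2.0))
    print("risk ratio:", comparison(2.0, how="ratio"))
    print("slope wrt x:", slope(2.0, 0))
```

In this sketch the slope depends on where it is evaluated, because the logistic curve is nonlinear. That is one reason these quantities are usually reported at specific predictor values or averaged over a dataset.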
Because computing average predictions, comparisons, and slopes is common practice, the marginaleffects package offers shortcut functions: avg_predictions(), avg_comparisons(), and avg_slopes(). These functions are simple wrappers around the main workhorse functions, returning averages over the whole dataset or by subgroup. Their purpose is to save keystrokes and improve code readability.
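The averaging step can be sketched in a few lines of plain Python. This is only an illustration of the idea, not the package's code: the unit-level estimates below are made-up numbers, and the `average` helper and its `by` parameter are hypothetical names.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical unit-level estimates, e.g. one comparison per row of the data.
rows = [
    {"group": "a", "estimate": 0.10},
    {"group": "a", "estimate": 0.20},
    {"group": "b", "estimate": 0.40},
    {"group": "b", "estimate": 0.60},
]


def average(rows, by=None):
    """Average unit-level estimates over all rows, or within each subgroup."""
    if by is None:
        return mean(r["estimate"] for r in rows)
    buckets = defaultdict(list)
    for r in rows:
        buckets[r[by]].append(r["estimate"])
    return {key: mean(values) for key, values in buckets.items()}


if __name__ == "__main__":
    print("overall:", average(rows))
    print("by group:", average(rows, by="group"))
```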
The marginaleffects package includes many more powerful utilities. The hypotheses() function and the hypothesis argument allow analysts to conduct linear and non-linear hypothesis tests on parameter estimates, or on any of the other quantities produced by the package. This makes it easy to make cross-group comparisons, compare different effect sizes, and more. The datagrid() function conveniently creates grids of predictor values; inferences() implements alternative inferential strategies like the bootstrap; and posterior_draws() makes it easy to extract draws from posterior distributions in Bayesian analyses.
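As a rough illustration of what a grid of predictor values looks like, the sketch below crosses every user-supplied value and holds the remaining columns at their mean. It is a simplified stand-in written with the standard library, not the datagrid() function itself. The `make_grid` helper, the toy data, and the hold-at-the-mean rule for unspecified columns are assumptions made for this example.

```python
from itertools import product
from statistics import mean


def make_grid(data, **values):
    """Cross every user-supplied value; hold all other columns at their mean.

    `data` maps column names to lists of numeric observations.
    """
    fixed = {name: mean(col) for name, col in data.items() if name not in values}
    names = list(values)
    return [
        {**fixed, **dict(zip(names, combo))}
        for combo in product(*(values[name] for name in names))
    ]


# Toy dataset (made-up numbers).
data = {"x": [1.0, 2.0, 3.0], "z": [10.0, 20.0, 30.0], "treated": [0, 1, 1]}

# Two values of x crossed with two values of treated gives four rows;
# z is not specified, so it is held at its mean in every row.
grid = make_grid(data, x=[0.0, 5.0], treated=[0, 1])

if __name__ == "__main__":
    for row in grid:
        print(row)
```

Each row of such a grid can then be passed to a prediction function to evaluate the model at chosen, easily described profiles rather than at every observed row.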
The functions in marginaleffects greatly simplify the analysis of randomized experiments and play a central role in analyzing observational data, such as in matching, inverse probability weighting, G-computation, multi-level regression with post-stratification (MRP), conjoint experiments, and multiple imputation for missing data.
All these features are consolidated into a single software package, available in two programming languages, and compatible with over 100 types of models—more than any comparable package. Supported models include linear, generalized linear (GLM), generalized additive (GAM), mixed-effects, fixed-effects, Bayesian models, and more.