38  Total Marginal Effects

Mize and Han (2025) recommend several strategies to summarize the effects of different levels of a categorical independent variable, or the effect of an independent variable across different levels of a categorical dependent variable. This notebook implements some of their strategies in marginaleffects. The quantities are simply defined and computed. Please refer to the original article in Sociological Science for motivation and intepretation.

38.1 Data and model

To begin, we load the required libraries and data from the General Social Survey. We are interested in the effect of the variables race4 and woman on the variable conserv. Since conserv is a binary variable, we fit a logit model.

library(marginaleffects) # marginal effects
library(modelsummary) # regression summaries
library(nanoparquet) # read data
library(nnet) # multinomial logit
library(future.apply) # parallel bootstrap
plan(multisession)
options(marginaleffects_parallel_inferences = TRUE)
options(width = 300)


# download data
gss <- read_parquet(url("https://marginaleffects.com/data/Mize_Han_GSS.parquet"))

# fit binary logit model
mod <- glm(conserv ~ woman + race4 + class, data = gss, family = binomial(link = "logit"))
get_estimates(mod)
                term    estimate  std.error conf.level   conf.low  conf.high  statistic df.error      p.value s.value group
1        (Intercept) -0.39634712 0.13780515       0.95 -0.6697286 -0.1289376 -2.8761416      Inf 4.025691e-03     8.0      
2         womanWomen -0.32748438 0.07292669       0.95 -0.4704960 -0.1845863 -4.4905970      Inf 7.102383e-06    17.1      
3         race4Black -0.83912641 0.14005666       0.95 -1.1203282 -0.5705368 -5.9913351      Inf 2.081252e-09    28.8      
4         race4Other -0.44149926 0.16713947       0.95 -0.7771915 -0.1206812 -2.6415021      Inf 8.253930e-03     6.9      
5      race4Hispanic -0.55132361 0.12594578       0.95 -0.8024685 -0.3082688 -4.3774680      Inf 1.200660e-05    16.3      
6 classworking class -0.06254098 0.14085335       0.95 -0.3359588  0.2167183 -0.4440149      Inf 6.570318e-01     0.6      
7  classmiddle class -0.02527655 0.13814936       0.95 -0.2932887  0.2488157 -0.1829654      Inf 8.548252e-01     0.2      
8   classupper class -0.16316188 0.21868570       0.95 -0.5958619  0.2626431 -0.7461022      Inf 4.556057e-01     1.1      

38.2 Categorical Independent Variable

38.2.1 Average Marginal Effect Inequality

Mize and Han (2025) define the “average marginal effect inequality” as the average of the absolute values of the marginal effects. Here, the expression “marginal effect,” for a categorical predictor, refers to what Chapter 6 referred to as “counterfactual comparisons.” First, we compute the marginal effects for every level of the race4 predictor.

cmp <- avg_comparisons(mod,
  variables = list(race4 = "pairwise"))
cmp

         Contrast Estimate Std. Error      z Pr(>|z|)    S    2.5 %  97.5 %
 Black - White     -0.1607     0.0226 -7.124  < 0.001 39.8 -0.20494 -0.1165
 Hispanic - Black   0.0480     0.0294  1.633  0.10252  3.3 -0.00962  0.1056
 Hispanic - Other  -0.0204     0.0376 -0.541  0.58832  0.8 -0.09410  0.0534
 Hispanic - White  -0.1127     0.0234 -4.814  < 0.001 19.4 -0.15865 -0.0668
 Other - Black      0.0683     0.0371  1.842  0.06552  3.9 -0.00439  0.1411
 Other - White     -0.0924     0.0322 -2.867  0.00414  7.9 -0.15553 -0.0292

Term: race4
Type: response

These results suggest that moving from White to Black is associated with a decrease of 16 percentage points in the probability that conserv equals 1.

The average absolute value of these marginal effects is the “average marginal effect inequality”:

abs(cmp$estimate)
[1] 0.16072324 0.04798034 0.02036276 0.11274290 0.06834311 0.09238014
mean(abs(cmp$estimate))
[1] 0.08375542

We can compute this more easily—and with standard errors—using the hypothesis argument.

avg_comparisons(mod,
  variables = list(race4 = "pairwise"),
  hypothesis = ~ I(mean(abs(x)))
)

 Estimate Std. Error    z Pr(>|z|)    S  2.5 % 97.5 %
   0.0838      0.013 6.45   <0.001 33.1 0.0583  0.109

Type: response

38.2.2 Weighted Average Marginal Effect Inequality

Mize and Han (2025) argue that the quantity computed in the previous section may be inappropriate when there are strong imbalances between the prevalence of different classes in the sample. Their proposed remedy is to weight each of the contrasts by their relative proportions in the sample. First, we create a vector of weights based on sample proportions.

weights <- prop.table(table(gss$race4))
weights <- c(
  (weights["Black"] + weights["White"]) / 3,
  (weights["Hispanic"] + weights["Black"]) / 3,
  (weights["Hispanic"] + weights["Other"]) / 3,
  (weights["Hispanic"] + weights["White"]) / 3,
  (weights["Other"] + weights["Black"]) / 3,
  (weights["Other"] + weights["White"]) / 3
)

Next, we define a wmean() function that computes the weighted mean, and we feed that function to the hypothesis argument.

wmean <- function(x) weighted.mean(x, weights)
avg_comparisons(mod,
  variables = list(race4 = "pairwise"),
  hypothesis = ~ I(wmean(abs(x)))
)

 Estimate Std. Error    z Pr(>|z|)    S  2.5 % 97.5 %
   0.0838      0.013 6.45   <0.001 33.1 0.0583  0.109

Type: response

38.2.3 Comparing within a model

Mize and Han (2025) also propose a strategy to compare the marginal effects of different categorical predictors. The idea is simple. First, we compute the average marginal effect inequality for each preditor. We do this by computing contrasts for the two predictors of interest (race4 and woman) using the variables argument. Then, we compute the quantity of interest using the hypothesis argument, taking care to include |term on the right-hand side of the formula, to compute the quantity of interest separately for the two terms.

cmp <- avg_comparisons(mod,
  variables = list(race4 = "pairwise", woman = "pairwise"),
  hypothesis = ~ I(mean(abs(x))) | term
)
cmp

  Term Estimate Std. Error    z Pr(>|z|)    S  2.5 % 97.5 %
 race4   0.0838     0.0130 6.45   <0.001 33.1 0.0583  0.109
 woman   0.0699     0.0156 4.48   <0.001 17.0 0.0393  0.100

Type: response

We find that the estimate is larger for race4 than for woman. Is this difference statistically significant? Are the two estimates distinguishable from one another? To check this, we pipe the previous command to the hypotheses() function.

avg_comparisons(mod,
  variables = list(race4 = "pairwise", woman = "pairwise"),
  hypothesis = ~ I(mean(abs(x))) | term) |>
  hypotheses(~pairwise)

        Hypothesis Estimate Std. Error      z Pr(>|z|)   S   2.5 % 97.5 %
 (woman) - (race4)  -0.0139      0.021 -0.661    0.509 1.0 -0.0551 0.0273

No. While the two estimates seem different at first glance, we cannot reject the null hypothesis that they are equal.

38.2.4 Comparing across models

Imagine that we wish to compare the marginal effect of the woman predictor in a model that controls for race to the marginaleffect effect of woman in a model that does not control for race. We can do this by fitting two models and then comparing the marginal effects. First, we define an estimator function that fits the models and returns a marginaleffects object with the difference of interest.1

estimator <- function(data) {
  # fit the two models we wish to compare
  mod1 <- glm(conserv ~ woman + class + race4, data = data, family = binomial(link = "logit"))
  mod2 <- glm(conserv ~ woman + class, data = data, family = binomial(link = "logit"))
  # compute the marginal effects in both models
  mfx1 <- avg_comparisons(mod1, variables = "woman", vcov = FALSE)
  mfx2 <- avg_comparisons(mod2, variables = "woman", vcov = FALSE)
  # compare the two models and store the difference in the estimate column
  # estimator() must return a marginaleffects object
  mfx1$estimate <- mfx1$estimate - mfx2$estimate
  return(mfx1)
}

estimator(gss)

 Estimate
  0.00636

Term: woman
Type: response
Comparison: Women - Men

Finally, we use the inferences() function to bootstrap the while procedure and obtain confidence intervals.

inferences(gss, method = "rsample", estimator = estimator)

 Estimate 2.5 % 97.5 %
  0.00636 0.002 0.0108

Term: woman
Type: response
Comparison: Women - Men

The number reported in the Estimate column represents the difference between the marginal effect of woman in the first model and the marginal effect of woman in the second model. The confidence interval represents the uncertainty around this difference.

38.3 Categorical Dependent Variable

To show how the concepts developed in the previous sections can be applied to a categorical dependent variable, we fit a multinomial logit model to the health variable. Our focal predictor is a binary variable called college.

table(gss$college, gss$health)
                
                 excellent good fair poor
  No Col Degree        287 1068  498  108
  College Degree       457  967  191   24
mod <- multinom(health ~ college + race4 + class + woman, data = gss, trace = FALSE)

avg_comparisons(mod, variables = c("college", "woman"))

    Term     Group                       Contrast Estimate Std. Error      z Pr(>|z|)    S    2.5 %   97.5 %
 college excellent College Degree - No Col Degree  0.08666    0.01504  5.761   <0.001 26.8  0.05718  0.11614
 college good      College Degree - No Col Degree  0.02378    0.01833  1.298   0.1945  2.4 -0.01214  0.05971
 college fair      College Degree - No Col Degree -0.08359    0.01425 -5.865   <0.001 27.7 -0.11152 -0.05565
 college poor      College Degree - No Col Degree -0.02686    0.00664 -4.043   <0.001 14.2 -0.03987 -0.01384
 woman   excellent Women - Men                     0.00129    0.01330  0.097   0.9227  0.1 -0.02478  0.02736
 woman   good      Women - Men                     0.01181    0.01656  0.713   0.4758  1.1 -0.02065  0.04427
 woman   fair      Women - Men                    -0.02372    0.01301 -1.824   0.0682  3.9 -0.04922  0.00177
 woman   poor      Women - Men                     0.01063    0.00610  1.743   0.0813  3.6 -0.00132  0.02257

Type: probs

The results in the Estimate column show the estimate marginal effect of getting a college degree on the probability of belonging to each of the health categories. Clearly, having a college degree seems to increase the likelihood that individuals will report being in good or excellent health.

We can compute the average marginal effect inequality for the college predictor and woman predictors as before, using the hypothesis argument.

avg_comparisons(mod, 
  variables = c("college", "woman"), 
  hypothesis = ~ I(mean(abs(x))) | term)

    Term Estimate Std. Error    z Pr(>|z|)    S     2.5 % 97.5 %
 college   0.0552    0.00752 7.35   <0.001 42.2  0.040490 0.0700
 woman     0.0119    0.00650 1.82   0.0682  3.9 -0.000887 0.0246

Type: probs

And we can compare the two estimates by piping the previous command to the hypotheses() function.

avg_comparisons(mod, 
  variables = c("college", "woman"), 
  hypothesis = ~ I(mean(abs(x))) | term) |>
  hypotheses(~revpairwise)

          Hypothesis Estimate Std. Error    z Pr(>|z|)    S  2.5 % 97.5 %
 (college) - (woman)   0.0434    0.00992 4.37   <0.001 16.3 0.0239 0.0628

It seems that college has a larger effect on health—on average across outcome levels—than woman.


  1. We set vcov=FALSE for efficiency because we will bootstrap the full procedure.↩︎