# Slopes

# Definition and motivation

Slopes are defined as:

Partial derivatives of the regression equation with respect to a regressor of interest. a.k.a. Marginal effects, trends, velocity.

In some disciplines, like economics and political science, slopes are called “marginal effects,” where marginal refers to the idea of a “small change,” in the calculus sense. In other disciplines, they are called “trends” or “velocity.” In this vignette, we use these terms interchageably.

A slope measures the association between a change in a regressor \(x\), and a change in the response \(y\), estimated at a specific values of the regressor \(x\). Slopes are extremely useful, because they are intuitive and easy to interpret. They are often the main quantity of interest in an empirical analysis.

In scientific practice, slopes fall in the same toolbox as the “Contrast.” Both try to answer a counterfactual question: What would happen to \(y\) if \(x\) were different? They allow us to model the “effect” of a change in the explanator \(x\) on the response \(y\).^{1}

To illustrate the concept, consider this quadratic function:

\[y = -x^2\]

From the definition above, we know that the marginal effect is the partial derivative of \(y\) with respect to \(x\):

\[\frac{\partial y}{\partial x} = -2x\]

To get intuition about how to interpret this quantity, consider the response of \(y\) to \(x\). It looks like this:

When \(x\) increases, \(y\) starts to increase. But then, as \(x\) increases further, \(y\) creeps back down in negative territory.

A marginal effect is the slope of this response function at a certain value of \(x\). The next plot adds three tangent lines, highlighting the slopes of the response function for three values of \(x\). The slopes of these tangents tell us three things:

- When \(x<0\), the slope is positive: an increase in \(x\) is associated with an increase in \(y\): The marginal effect is positive.
- When \(x=0\), the slope is null: a (small) change in \(x\) is associated with no change in \(y\). The marginal effect is null.
- When \(x>0\), the slope is negative: an increase in \(x\) is associated with a decrease in \(y\). The marginal effect is negative.

Above, we considered how marginal effects can be computed analytically in a simple quadratic equation context. We can now use the `slopes`

function to replicate our analysis of the quadratic function in a regression application.

Say you estimate a linear regression model with a quadratic term:

\[Y = \beta_0 + \beta_1 X^2 + \varepsilon\]

and obtain estimates of \(\beta_0=1\) and \(\beta_1=2\). Taking the partial derivative with respect to \(X\) and plugging in our estimates gives us the marginal effect of \(X\) on \(Y\):

\[\partial Y / \partial X = \beta_0 + 2 \cdot \beta_1 X\] \[\partial Y / \partial X = 1 + 4X\]

This result suggests that the effect of a *change* in \(X\) on \(Y\) depends on the *level* of \(X\). When \(X\) is large and positive, an increase in \(X\) is associated to a large increase in \(Y\). When \(X\) is small and positive, an increase in \(X\) is associated to a small increase in \(Y\). When \(X\) is a large negative value, an increase in \(X\) is associated with a *decrease* in \(Y\).

`marginaleffects`

arrives at the same conclusion in simulated data:

```
library(tidyverse)
N <- 1e5
quad <- data.frame(x = rnorm(N))
quad$y <- 1 + 1 * quad$x + 2 * quad$x^2 + rnorm(N)
mod_quad <- lm(y ~ x + I(x^2), quad)
slopes(mod_quad, newdata = datagrid(x = -2:2)) |>
mutate(truth = 1 + 4 * x) |>
select(estimate, truth)
#>
#> Estimate
#> -6.98
#> -2.98
#> 1.01
#> 5.00
#> 8.99
#>
#> Columns: estimate, truth
```

We can plot conditional adjusted predictions with `plot_predictions`

function:

`plot_predictions(mod_quad, condition = "x")`

We can plot conditional marginal effects with the `plot_slopes`

function (see section below):

`plot_slopes(mod_quad, variables = "x", condition = "x")`

Again, the conclusion is the same. When \(x<0\), an increase in \(x\) is associated with an decrease in \(y\). When \(x>1/4\), the marginal effect is positive, which suggests that an increase in \(x\) is associated with an increase in \(y\).

Slopes are *unit-level* measures of association between changes in a regressor and changes in the response. Except in the simplest linear models, the value of the marginal effect will be different from individual to individual, because it will depend on the values of the other covariates for each individual.

By default, the `slopes()`

function thus produces distinct estimates of the marginal effect for each row of the data used to fit the model:

```
mfx <- slopes(mod_quad)
dim(mfx)
#> [1] 100000 14
head(mfx)
#>
#> Estimate Std. Error z Pr(>|z|) S 2.5 % 97.5 %
#> -1.7895 0.00447 -400.22 <0.001 Inf -1.7983 -1.7807
#> -3.4719 0.00596 -582.28 <0.001 Inf -3.4836 -3.4602
#> 5.7850 0.00628 921.65 <0.001 Inf 5.7727 5.7973
#> 0.0321 0.00336 9.55 <0.001 69.4 0.0255 0.0387
#> 4.2220 0.00484 872.72 <0.001 Inf 4.2125 4.2315
#> -1.7168 0.00441 -388.95 <0.001 Inf -1.7254 -1.7081
#>
#> Term: x
#> Type: response
#> Columns: rowid, term, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high, predicted_lo, predicted_hi, predicted, y, x
mfx
#>
#> Estimate Std. Error z Pr(>|z|) S 2.5 % 97.5 %
#> -1.7895 0.00447 -400.22 <0.001 Inf -1.7983 -1.7807
#> -3.4719 0.00596 -582.28 <0.001 Inf -3.4836 -3.4602
#> 5.7850 0.00628 921.65 <0.001 Inf 5.7727 5.7973
#> 0.0321 0.00336 9.55 <0.001 69.4 0.0255 0.0387
#> 4.2220 0.00484 872.72 <0.001 Inf 4.2125 4.2315
#> --- 99990 rows omitted. See ?avg_slopes and ?print.marginaleffects ---
#> 3.2465 0.00407 797.26 <0.001 Inf 3.2385 3.2545
#> 7.1441 0.00764 935.10 <0.001 Inf 7.1291 7.1591
#> 0.3796 0.00325 116.65 <0.001 Inf 0.3732 0.3860
#> 3.1442 0.00400 785.86 <0.001 Inf 3.1363 3.1520
#> 5.1190 0.00564 907.47 <0.001 Inf 5.1079 5.1300
#> Term: x
#> Type: response
#> Columns: rowid, term, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high, predicted_lo, predicted_hi, predicted, y, x
```

# The Marginal Effects Zoo

Analysts often use the term “marginal effects” to refer to slopes. And since slopes are conditional unit-level quantities, they can be aggregated and reported in various ways. This gave rise to nomenclature that can be confusion to newcomers.

In this section, we show how the conceptual framework introduced in the “Get Started”, the “Predictions”, and the “Comparisons” vignettes can help us understand these different quantities. Throughout, we lean on the five pillars:

- Quantity of interest
- Grid
- Aggregation
- Uncertainty
- Test

This allows us to understand various estimates described in Andrew Heiss’ characteristically excellent blog post::

- Average Marginal Effects
- Group-Average Marginal Effects
- Marginal Effects at User-Specified Values (or Representative Values)
- Marginal Effects at the Mean
- Counterfactual Marginal Effects
- Conditional Marginal Effects

Throughout this section, we will illustrate various concepts using a logistic regression model fit on data from the Palmer Penguins dataset:

```
library(marginaleffects)
dat <- read.csv("https://vincentarelbundock.github.io/Rdatasets/csv/palmerpenguins/penguins.csv")
dat$large_penguin <- ifelse(dat$body_mass_g > median(dat$body_mass_g, na.rm = TRUE), 1, 0)
mod <- glm(large_penguin ~ bill_length_mm * flipper_length_mm + species,
data = dat, family = binomial)
```

## Average Marginal Effect (AME)

A dataset with one marginal effect estimate per unit of observation is a bit unwieldy and difficult to interpret. Many analysts like to report the “Average Marginal Effect”, that is, the average of all the observation-specific marginal effects. These are easy to compute based on the full `data.frame`

shown above, but the `avg_slopes()`

function is convenient:

```
avg_slopes(mod)
#>
#> Term Contrast Estimate Std. Error z Pr(>|z|) S 2.5 % 97.5 %
#> bill_length_mm mean(dY/dX) 0.0279 0.00595 4.685 <0.001 18.4 0.01621 0.0395
#> flipper_length_mm mean(dY/dX) 0.0105 0.00237 4.425 <0.001 16.7 0.00585 0.0151
#> species mean(Chinstrap) - mean(Adelie) -0.4128 0.05600 -7.371 <0.001 42.4 -0.52255 -0.3030
#> species mean(Gentoo) - mean(Adelie) 0.0609 0.10734 0.568 0.57 0.8 -0.14946 0.2713
#>
#> Type: response
#> Columns: term, contrast, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high, predicted_lo, predicted_hi, predicted
```

Note that since marginal effects are derivatives, they are only properly defined for continuous numeric variables. When the model also includes categorical regressors, the `summary`

function will try to display relevant (regression-adjusted) contrasts between different categories, as shown above.

## Group-Average Marginal Effect (G-AME)

We can also use the `by`

argument the average marginal effects *within different subgroups* of the observed data, based on values of the regressors. For example, to compute the average marginal effects of Bill Length for each Species, we do:

```
mfx <- avg_slopes(
mod,
by = "species",
variables = "bill_length_mm")
mfx
#>
#> species Estimate Std. Error z Pr(>|z|) S 2.5 % 97.5 %
#> Adelie 0.04392 0.00904 4.860 <0.001 19.7 0.02620 0.0616
#> Chinstrap 0.03704 0.00970 3.818 <0.001 12.9 0.01803 0.0561
#> Gentoo 0.00307 0.00313 0.981 0.327 1.6 -0.00306 0.0092
#>
#> Term: bill_length_mm
#> Type: response
#> Comparison: mean(dY/dX)
#> Columns: term, contrast, species, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high, predicted_lo, predicted_hi, predicted
```

This is equivalent to manually taking the mean of the observation-level marginal effect for each species sub-group:

Note that `marginaleffects`

follows `Stata`

and the `margins`

package in computing standard errors using the group-wise averaged Jacobian.

## Marginal Effect at User-Specified Values

Sometimes, we are not interested in *all* the unit-specific marginal effects, but would rather look at the estimated marginal effects for certain “typical” individuals, or for user-specified values of the regressors. The `datagrid`

function helps us build a data grid full of “typical” rows. For example, to generate artificial Adelies and Gentoos with 180mm flippers:

The same command can be used (omitting the `model`

argument) to `marginaleffects`

’s `newdata`

argument to compute marginal effects for those (fictional) individuals:

```
slopes(
mod,
newdata = datagrid(
flipper_length_mm = 180,
species = c("Adelie", "Gentoo")))
#>
#> Term Contrast flipper_length_mm species Estimate Std. Error z Pr(>|z|) S 2.5 % 97.5 %
#> bill_length_mm dY/dX 180 Adelie 0.0578 0.03496 1.652 0.0986 3.3 -0.01078 1.26e-01
#> bill_length_mm dY/dX 180 Gentoo 0.0804 0.04311 1.864 0.0623 4.0 -0.00413 1.65e-01
#> flipper_length_mm dY/dX 180 Adelie 0.0233 0.00555 4.191 <0.001 15.1 0.01238 3.41e-02
#> flipper_length_mm dY/dX 180 Gentoo 0.0324 0.00880 3.677 <0.001 12.1 0.01512 4.96e-02
#> species Chinstrap - Adelie 180 Adelie -0.2106 0.10748 -1.959 0.0501 4.3 -0.42121 9.48e-05
#> species Chinstrap - Adelie 180 Gentoo -0.2106 0.10748 -1.959 0.0501 4.3 -0.42121 9.48e-05
#> species Gentoo - Adelie 180 Adelie 0.1556 0.30082 0.517 0.6051 0.7 -0.43403 7.45e-01
#> species Gentoo - Adelie 180 Gentoo 0.1556 0.30082 0.517 0.6051 0.7 -0.43403 7.45e-01
#>
#> Type: response
#> Columns: rowid, term, contrast, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high, flipper_length_mm, species, predicted_lo, predicted_hi, predicted, bill_length_mm, large_penguin
```

When variables are omitted from the `datagrid`

call, they will automatically be set at their mean or mode (depending on variable type).

## Marginal Effect at the Mean (MEM)

The “Marginal Effect at the Mean” is a marginal effect calculated for a hypothetical observation where each regressor is set at its mean or mode. By default, the `datagrid`

function that we used in the previous section sets all regressors to their means or modes. To calculate the MEM, we can set the `newdata`

argument, which determines the values of predictors at which we want to compute marginal effects:

```
slopes(mod, newdata = "mean")
#>
#> Term Contrast Estimate Std. Error z Pr(>|z|) S 2.5 % 97.5 %
#> bill_length_mm dY/dX 0.0519 0.01447 3.585 <0.001 11.5 0.02351 0.0803
#> flipper_length_mm dY/dX 0.0193 0.00558 3.463 <0.001 10.9 0.00838 0.0303
#> species Chinstrap - Adelie -0.8088 0.07663 -10.554 <0.001 84.1 -0.95897 -0.6586
#> species Gentoo - Adelie 0.0818 0.11540 0.709 0.478 1.1 -0.14435 0.3080
#>
#> Type: response
#> Columns: rowid, term, contrast, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high, predicted_lo, predicted_hi, predicted, bill_length_mm, flipper_length_mm, species, large_penguin
```

## Counterfactual Marginal Effects

The `datagrid`

function allowed us look at completely fictional individuals. Setting the `grid_type`

argument of this function to `"counterfactual"`

lets us compute the marginal effects for the actual observations in our dataset, but with a few manipulated values. For example, this code will create a `data.frame`

twice as long as the original `dat`

, where each observation is repeated with different values of the `flipper_length_mm`

variable:

We see that the rows 1, 2, and 3 of the original dataset have been replicated twice, with different values of the `flipper_length_mm`

variable:

```
nd[nd$rowid %in% 1:3,]
#> rowidcf large_penguin bill_length_mm species flipper_length_mm
#> 1 1 0 39.1 Adelie 160
#> 2 2 0 39.5 Adelie 160
#> 3 3 0 40.3 Adelie 160
#> 343 1 0 39.1 Adelie 180
#> 344 2 0 39.5 Adelie 180
#> 345 3 0 40.3 Adelie 180
```

We can use the observation-level marginal effects to compute average (or median, or anything else) marginal effects over the counterfactual individuals:

## Conditional Marginal Effects (Plot)

The `plot_slopes`

function can be used to draw “Conditional Marginal Effects.” This is useful when a model includes interaction terms and we want to plot how the marginal effect of a variable changes as the value of a “condition” (or “moderator”) variable changes:

```
mod <- lm(mpg ~ hp * wt + drat, data = mtcars)
plot_slopes(mod, variables = "hp", condition = "wt")
```

The marginal effects in the plot above were computed with values of all regressors – except the `variables`

and the `condition`

– held at their means or modes, depending on variable type.

Since `plot_slopes()`

produces a `ggplot2`

object, it is easy to customize. For example:

```
plot_slopes(mod, variables = "hp", condition = "wt") +
geom_rug(aes(x = wt), data = mtcars) +
theme_classic()
```

## Slopes vs Predictions: A Visual Interpretation

Often, analysts will plot predicted values of the outcome with a best fit line:

```
library(ggplot2)
mod <- lm(mpg ~ hp * qsec, data = mtcars)
plot_predictions(mod, condition = "hp", vcov = TRUE) +
geom_point(data = mtcars, aes(hp, mpg))
```

The slope of this line is calculated using the same technique we all learned in grade school: dividing rise over run.

```
p <- plot_predictions(mod, condition = "hp", vcov = TRUE, draw = FALSE)
plot_predictions(mod, condition = "hp", vcov = TRUE) +
geom_segment(aes(x = p$hp[10], xend = p$hp[10], y = p$estimate[10], yend = p$estimate[20])) +
geom_segment(aes(x = p$hp[10], xend = p$hp[20], y = p$estimate[20], yend = p$estimate[20])) +
annotate("text", label = "Rise", y = 10, x = 140) +
annotate("text", label = "Run", y = 2, x = 200)
```

Instead of computing this slope manually, we can just call:

```
avg_slopes(mod, variables = "hp")
#>
#> Estimate Std. Error z Pr(>|z|) S 2.5 % 97.5 %
#> -0.112 0.0126 -8.92 <0.001 61.0 -0.137 -0.0874
#>
#> Term: hp
#> Type: response
#> Comparison: mean(dY/dX)
#> Columns: term, contrast, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high, predicted_lo, predicted_hi, predicted
```

Now, consider the fact that our model includes an interaction between `hp`

and `qsec`

. This means that the slope will actually differ based on the value of the moderator variable `qsec`

:

`plot_predictions(mod, condition = list("hp", "qsec" = "quartile"))`

We can estimate the slopes of these three fit lines easily:

```
slopes(
mod,
variables = "hp",
newdata = datagrid(qsec = quantile(mtcars$qsec, probs = c(.25, .5, .75))))
#>
#> qsec Estimate Std. Error z Pr(>|z|) S 2.5 % 97.5 %
#> 16.9 -0.0934 0.0111 -8.43 <0.001 54.7 -0.115 -0.0717
#> 17.7 -0.1093 0.0123 -8.92 <0.001 60.9 -0.133 -0.0853
#> 18.9 -0.1325 0.0154 -8.61 <0.001 56.9 -0.163 -0.1023
#>
#> Term: hp
#> Type: response
#> Columns: rowid, term, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high, qsec, predicted_lo, predicted_hi, predicted, hp, mpg
```

As we see in the graph, all three slopes are negative, but the Q3 slope is steepest.

We could then push this one step further, and measure the slope of `mpg`

with respect to `hp`

, *for all observed values* of `qsec`

. This is achieved with the `plot_slopes()`

function:

```
plot_slopes(mod, variables = "hp", condition = "qsec") +
geom_hline(yintercept = 0, linetype = 3)
```

This plot shows that the marginal effect of `hp`

on `mpg`

is always negative (the slope is always below zero), and that this effect becomes even more negative as `qsec`

increases.

# Minimum velocity

In some fields such as epidemiology, it is common to compute the minimum or maximum slope, as a measure of the “velocity” of the response function. To achieve this, we can use the `comparisons()`

function and define custom functions. Here, we present a quick example without much detail, but you can refer to the comparisons vignette for more explanations on custom functions in the `comparison`

argument.

Consider this simple GAM model:

```
library(marginaleffects)
library(itsadug)
library(mgcv)
simdat$Subject <- as.factor(simdat$Subject)
model <- bam(Y ~ Group + s(Time, by = Group) + s(Subject, bs = "re"),
data = simdat)
plot_slopes(model, variables = "Time", condition = c("Time", "Group"))
```

Minimum velocity, overall:

```
comparisons(model,
variables = list("Time" = 1e-6),
vcov = FALSE,
comparison = \(hi, lo) min((hi - lo) / 1e-6))
#>
#> Estimate
#> -0.0267
#>
#> Term: Time
#> Type: response
#> Comparison: +1e-06
#> Columns: term, contrast, estimate, predicted_lo, predicted_hi, predicted
```

Minimum velocity by group:

```
comparisons(model,
variables = list("Time" = 1e-6),
by = "Group",
vcov = FALSE,
comparison = \(hi, lo) min((hi - lo) / 1e-6))
#>
#> Group Estimate
#> Children -0.0153
#> Adults -0.0267
#>
#> Term: Time
#> Type: response
#> Comparison: +1e-06
#> Columns: term, contrast, Group, estimate
```

Difference between the minimum velocity of each group:

```
comparisons(model,
variables = list("Time" = 1e-6),
vcov = FALSE,
by = "Group",
hypothesis = "pairwise",
comparison = \(hi, lo) min((hi - lo) / 1e-6))
#>
#> Estimate
#> 0.0114
#>
#> Term: Children - Adults
#> Type: response
#> Columns: term, estimate
```

What `Time`

value corresponds to the minimum velocity?

```
low = function(hi, lo, x) {
dydx <- (hi - lo) / 1e-6
dydx_min <- min(dydx)
x[dydx == dydx_min][1]
}
comparisons(model,
variables = list("Time" = 1e-6),
vcov = FALSE,
by = "Group",
comparison = low)
#>
#> Group Estimate
#> Children 1313
#> Adults 1556
#>
#> Term: Time
#> Type: response
#> Comparison: +1e-06
#> Columns: term, contrast, Group, estimate
```

# Manual computation

Now we illustrate how to reproduce the output of `slopes`

manually:

```
library(marginaleffects)
mod <- glm(am ~ hp, family = binomial, data = mtcars)
eps <- 1e-4
d1 <- transform(mtcars, hp = hp - eps / 2)
d2 <- transform(mtcars, hp = hp + eps / 2)
p1 <- predict(mod, type = "response", newdata = d1)
p2 <- predict(mod, type = "response", newdata = d2)
s <- (p2 - p1) / eps
tail(s)
#> Porsche 914-2 Lotus Europa Ford Pantera L Ferrari Dino Maserati Bora Volvo 142E
#> -0.0020285496 -0.0020192814 -0.0013143243 -0.0018326764 -0.0008900012 -0.0020233577
```

Which is equivalent to:

```
slopes(mod, eps = eps) |> tail()
#>
#> Estimate Std. Error z Pr(>|z|) S 2.5 % 97.5 %
#> -0.00203 0.001347 -1.51 0.13199 2.9 -0.00467 0.000611
#> -0.00202 0.001543 -1.31 0.19068 2.4 -0.00504 0.001005
#> -0.00131 0.000452 -2.91 0.00360 8.1 -0.00220 -0.000429
#> -0.00183 0.001295 -1.41 0.15713 2.7 -0.00437 0.000706
#> -0.00089 0.000277 -3.21 0.00132 9.6 -0.00143 -0.000347
#> -0.00202 0.001571 -1.29 0.19776 2.3 -0.00510 0.001056
#>
#> Term: hp
#> Type: response
#> Columns: rowid, term, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high, predicted_lo, predicted_hi, predicted, am, hp
```

And we can get average marginal effects by subgroup as follows:

```
tapply(s, mtcars$cyl, mean)
#> 4 6 8
#> -0.002010526 -0.001990774 -0.001632681
slopes(mod, eps = eps, by = "cyl")
#>
#> cyl Estimate Std. Error z Pr(>|z|) S 2.5 % 97.5 %
#> 4 -0.00201 0.001408 -1.43 0.1534 2.7 -0.00477 0.000749
#> 6 -0.00199 0.001514 -1.32 0.1885 2.4 -0.00496 0.000976
#> 8 -0.00163 0.000955 -1.71 0.0873 3.5 -0.00350 0.000239
#>
#> Term: hp
#> Type: response
#> Comparison: mean(dY/dX)
#> Columns: term, contrast, cyl, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high, predicted_lo, predicted_hi, predicted
```

## Footnotes

The term “effect” is tricky. To be clear, this vignette does

*not*use the word “effect” to imply “causality”.↩︎