In an ordinary least squares (OLS) regression model, the marginal effect of an independent variable on the dependent variable is simply the coefficient estimate reported by the statistical software package. Assume a simple model where y is regressed on x, x takes on values from 1 to 100, and the estimate of Beta_1 is 2 (i.e., y = B0 + B1x + e, where B1 = 2). What OLS has given us is an average marginal effect across all the values of x. It doesn’t matter whether we are predicting y at an x value of 1 or an x value of 100: in this simple model a one-unit increase in x always changes predicted y by the same constant amount, 2, and the prediction itself is B0 plus 2 times the value of x.

Various model specifications and functional forms may be used to relax the assumption of a constant marginal effect. For example, specifying the model with ln(y) rather than y (which estimates a constant **percentage** change in y per unit change in x), including an interaction term in a multivariate model, or including both x and x^2 in a multivariate model all allow for non-constant marginal effects.
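
To make the three specifications concrete, the derivatives can be written out as a short sketch. The symbols here are assumptions for illustration: \beta_0, \beta_1 stand for B0, B1 above, while z (a second regressor) and the coefficients \beta_2, \beta_3 do not appear in the original example.

```latex
\begin{align*}
\text{Quadratic:}   \quad & y = \beta_0 + \beta_1 x + \beta_2 x^2 + e
  && \Rightarrow \quad \frac{\partial y}{\partial x} = \beta_1 + 2\beta_2 x \\
\text{Interaction:} \quad & y = \beta_0 + \beta_1 x + \beta_2 z + \beta_3 x z + e
  && \Rightarrow \quad \frac{\partial y}{\partial x} = \beta_1 + \beta_3 z \\
\text{Log outcome:} \quad & \ln(y) = \beta_0 + \beta_1 x + e
  && \Rightarrow \quad \frac{\partial y}{\partial x} = \beta_1 y
\end{align*}
```

In each case the marginal effect depends on where it is evaluated (on x, on z, or on the level of y), which is exactly what the plain linear model rules out.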

Returning to the simple OLS model, the marginal effect of x on y is a derivative. The model tells us what a one-unit change in x does to y. Since y = B0 + B1x + e, dy/dx = B1. However, for probit and logit models we can’t simply look at the coefficient estimate and immediately know the marginal effect of a one-unit change in x on y (more precisely, on the predicted probability that y equals 1). These are nonlinear models in which different values of x have different marginal effects on y. In the example above where x goes from 1 to 100, the impact on y when x equals 1 will be different from the impact on y when x equals 100. This is completely different from the simple OLS example, where the underlying values of x did not matter and the marginal effect of x on y was always 2.

The sign of the impact x has on y can be read directly from the statistical software package output for probit and logit models, but the marginal effect cannot. The coefficient estimate is important, but it is only one piece of the marginal effect. This is because the probit model uses the cumulative distribution function (CDF) of the standard normal distribution evaluated at the linear prediction (i.e., B0 + B1x, commonly referred to as “**XB**” in econometrics texts), and the logit model uses the CDF of the standard logistic distribution evaluated at the same linear prediction (i.e., B0 + B1x, or **XB**). Calculating the same derivative, dy/dx, for a probit or logit model now requires the chain rule from calculus. The derivative of the CDF of the relevant distribution evaluated at **XB** is 1) the probability density function (PDF) of the relevant distribution (standard normal or standard logistic) evaluated at **XB**, times 2) the derivative of **XB** with respect to x, which is the coefficient estimate Beta_1 in this simple example. Note that the first part arises because the PDF is the derivative of the CDF, and the second part comes from the chain rule.
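
The chain-rule calculation can be sketched in a few lines of Python using only the standard library. This is a minimal illustration under stated assumptions, not the Stata code that accompanies this post: the coefficient values `B0 = -2.0` and `B1 = 0.04` are hypothetical (and the same values are reused for probit and logit only to keep the sketch short), and every function name is made up for this example. It computes PDF(**XB**) times Beta_1 at several values of x, compares it with a numerical derivative of the CDF, and contrasts the marginal effect at the mean of x with the mean of the marginal effects.

```python
import math

# Hypothetical coefficients for illustration only; not estimates from any data.
# The same values are reused for probit and logit purely to keep the sketch short.
B0, B1 = -2.0, 0.04
XS = list(range(1, 101))  # x takes the values 1..100, as in the running example


def xb(x):
    """Linear prediction XB = B0 + B1*x."""
    return B0 + B1 * x


def normal_cdf(z):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def normal_pdf(z):
    """Standard normal PDF."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def logistic_cdf(z):
    """Standard logistic CDF."""
    return 1.0 / (1.0 + math.exp(-z))


def logistic_pdf(z):
    """Standard logistic PDF, which equals CDF * (1 - CDF)."""
    p = logistic_cdf(z)
    return p * (1.0 - p)


def marginal_effect(pdf, x):
    """Chain rule: d CDF(XB) / dx = PDF(XB) * B1."""
    return pdf(xb(x)) * B1


def numeric_slope(cdf, x, h=1e-5):
    """Central finite difference of the predicted probability, as a cross-check."""
    return (cdf(xb(x + h)) - cdf(xb(x - h))) / (2.0 * h)


X_BAR = sum(XS) / len(XS)  # mean of x

for name, cdf, pdf in (("probit", normal_cdf, normal_pdf),
                       ("logit", logistic_cdf, logistic_pdf)):
    at_mean = marginal_effect(pdf, X_BAR)                          # effect at mean x
    mean_me = sum(marginal_effect(pdf, x) for x in XS) / len(XS)   # mean of effects
    print(name, "coefficient B1:", B1)
    print(name, "predicted probability at mean x:", round(cdf(xb(X_BAR)), 4))
    print(name, "marginal effect at x=1, x=100:",
          round(marginal_effect(pdf, 1), 5), round(marginal_effect(pdf, 100), 5))
    print(name, "marginal effect at mean x:", round(at_mean, 5))
    print(name, "mean marginal effect:", round(mean_me, 5))
    print(name, "numerical check at mean x:", round(numeric_slope(cdf, X_BAR), 5))
```

With these made-up coefficients, none of the computed marginal effects equals the coefficient itself, and the effect at the mean of x differs from the mean of the effects, which is why the two quantities are reported separately.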

For example, in the case of the probit model, the marginal effect of x on y is the probability density function (PDF) of the standard normal distribution, evaluated at **XB**, multiplied by Beta_1. What follows is a Stata .do file that does the following for both probit and logit models: 1) illustrates that the coefficient estimate is not the marginal effect, 2) calculates the predicted probability “by hand” based on **XB**, 3) calculates the marginal effect at the mean of x “by hand,” and 4) calculates the mean marginal effect of x “by hand.”