

              (1)          (2)
              OLS          Probit
x          0.218***     0.209***
            (4.99)       (4.88)
d          0.237***     0.247***
            (8.81)      (11.67)
N            1000         1000


Sure enough, it doesn’t matter in this case whether you report OLS estimates or marginal effects from a nonlinear model. Your findings are essentially the same.
Now suppose you want to see if the effect of x on y varies across groups defined by the dummy variable d. You create the interaction of x and d, denoted xd, and perform the exercise above again:


              (1)          (2)
              OLS          Probit
x          0.290***     0.216***
            (5.53)       (4.57)
d          0.365***     0.262***
            (6.24)       (5.44)
xd        -0.231*        0.044
           (-2.46)       (0.35)
N            1000         1000


Now you have a problem: the OLS estimate tells you that the effect of x on y is smaller when d=1 than when d=0 (b = -0.23, t = -2.46), but the probit estimates suggest that the effect of x on y is pretty stable as we vary d (b = 0.04, t = 0.35). What's going on?
The issue is that the coefficients on the interaction term demand different interpretations in linear and nonlinear models. The OLS model is

E[y|x,d] = \beta_0 + \beta_1 x + \beta_2 d + \beta_3 xd,

so

\partial^2 E[y|x,d] / \partial x \partial d = \beta_3

(where for simplicity I've ignored d's binary nature). However, in the probit model

P(y=1|x,d) = \Phi(u), \quad u = \beta_0 + \beta_1 x + \beta_2 d + \beta_3 xd,

where \Phi denotes the standard normal CDF. Therefore,

\partial P(y=1|x,d) / \partial (xd) = \beta_3 \phi(u),

where \phi denotes the standard normal density. But this is not what we want to evaluate! This term is the marginal effect of the interaction term (xd), but what we want is the interaction effect:

\partial^2 P(y=1|x,d) / \partial x \partial d = \beta_3 \phi(u) + (\beta_1 + \beta_3 d)(\beta_2 + \beta_3 x) \phi'(u),

which is a more complicated expression not generally equal to \beta_3 \phi(u). If you run a nonlinear model including an interaction term, the marginal effect of the interaction term is not the cross-effect of the variables you've interacted. The two expressions are the same only in linear models, which means this distinction is irrelevant for the OLS model.
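To see the difference concretely, here is a minimal numerical sketch (in Python with scipy rather than Stata; the coefficient values are made up for illustration). It evaluates both expressions at one point when the structural interaction coefficient is exactly zero:

```python
import numpy as np
from scipy.stats import norm

# Hypothetical index-function coefficients; note b3 = 0 (no structural interaction).
b0, b1, b2, b3 = -1.0, 0.5, 1.0, 0.0
x, d = 1.0, 1.0
u = b0 + b1 * x + b2 * d + b3 * x * d  # index = 0.5 at this point

# Marginal effect of the interaction regressor xd: b3 * phi(u).
me_xd = b3 * norm.pdf(u)

# Interaction (cross) effect: b3*phi(u) + (b1 + b3*d)(b2 + b3*x)*phi'(u),
# using phi'(u) = -u * phi(u) for the standard normal density.
cross = b3 * norm.pdf(u) + (b1 + b3 * d) * (b2 + b3 * x) * (-u) * norm.pdf(u)

print(me_xd)  # 0.0: the marginal effect of xd is exactly zero when b3 = 0
print(cross)  # about -0.088: the cross effect is nonzero anyway
```

The marginal effect of xd vanishes with \beta_3, but the cross effect does not: it picks up the curvature of \Phi through the \phi'(u) term.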
In the example above, the data are artificial, and the true coefficient on the interaction term in the index function is zero. We can see why the OLS estimate nonetheless recovers a nonzero interaction by graphing predicted probabilities in the d=0 and d=1 regimes:
The slope is shallower for the d=1 group than for the d=0 group, even though there is no structural interaction effect, because the d=1 group has a high probability of y=1 everywhere. Consider a more extreme case in which the probability that y=1 in the d=1 group increases with x, but only from 0.99 to 0.999 as x varies from its lowest to its highest value: then the slope must be only slightly above zero everywhere, no matter how large an effect x has on the index function.
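A quick numerical check of this kind of extreme case (again Python, with an illustrative slope coefficient): pick the intercept so the d=1 group starts at probability 0.99, give x a large effect on the index, and look at the slope of the predicted probability:

```python
import numpy as np
from scipy.stats import norm

b1 = 2.0                    # large hypothetical effect of x on the index
a = norm.ppf(0.99)          # intercept chosen so P(y=1) = 0.99 at x = 0

x = np.linspace(0.0, 1.0, 101)
p = norm.cdf(a + b1 * x)    # predicted probability for the d = 1 group

# Slope of the predicted probability: b1 * phi(a + b1*x).
slope = b1 * norm.pdf(a + b1 * x)

print(p[0], p[-1])          # roughly 0.99 up to about 0.99999
print(slope.max())          # about 0.053: nearly flat despite b1 = 2
```

Even with an index coefficient of 2, the predicted-probability curve is nearly flat in this range, because the normal density is tiny that far into the upper tail.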
Chunrong Ai and Edward Norton (2003) discuss this issue, and the authors also wrote new Stata commands to numerically calculate the correct interaction effects. However, Bill Greene points out in a recent paper that perhaps we don't want to evaluate the interaction effect on the conditional probability either. Greene argues that nonlinearity implies the observed interaction effect will generally be nonzero even when the true coefficient on the interaction term in the index function is zero (as illustrated above), so observed interactions may be artifacts of functional form. A given structural model can generate all sorts of observed interaction effects depending on the properties of the data: in the graph above, for example, the slopes of the two lines would become more similar if we sampled x's much larger than one, the highest x in the artificial data. For these reasons Greene suggests that the sort of tests econometricians have been reporting following Ai and Norton (2003) are not meaningful. Instead, he argues, we should conduct hypothesis tests only on the structural parameters of the model, such as the difficult-to-interpret probit coefficients, and not on implications of the model such as marginal effects.
See also Puhani (2008), who points out that some of these issues are irrelevant in difference-in-differences models in which the interaction term of interest is also a treatment dummy.
THOU SHALT USE .DO FILES. No, don’t argue, just do it, even for your exploratory data analysis. You will find that doing everything in .do files is much faster than working from the command line (or, even worse, the dropdown menus), and you automatically document your work. Also, make sure you adequately comment your .do files. That convoluted code you just wrote makes perfect sense to you right now, but will you remember what it does when you come back to this project a week from now, or a year from now?
Your goal while writing your .do file is to keep your code clean, easily readable, and efficient. Writing decent code will speed up your work and minimize the chances that a coding error will affect your results. You want to set up your code in such a way that you can easily make changes to your specification and see how your results are affected. And you want to produce output that looks good right out of Stata so you don’t have to do a lot of work writing up your tables.
Suppose you’ve got your data all cleaned following steps such as those Frances lays out. For this example we’ll use the Mroz wage data as presented by Jeff Wooldridge, available online, and we’ll estimate some log wage models. We’ll start the .do file by setting some options:
*** Sample .do file, Chris Auld
*** last modified: October 4, 2011

*** Preliminaries.
# delimit ;
drop _all ;
clear matrix ;
capture log close ;
log using cchsexample, replace ;

*** Read cleaned data, show summary statistics ;
use http://fmwww.bc.edu/ecp/data/wooldridge/MROZ ;
It's a good idea to use a character to mark the end of command lines. Otherwise you can't wrap long commands over multiple lines, which makes your code hard to read. "#delimit ;" tells Stata to interpret the semicolon as indicating the end of a command. (The downside is that 50% of your syntax errors will henceforth be the result of forgetting a semicolon.) Then we nuke any data or matrices Stata may have in memory when the program starts with the drop _all and clear matrix commands. If you like, also set memory, scheme, and other options here.
We want to keep a record of the output. capture log close tells Stata to stop logging if it is logging, and not to halt on an error if it isn't. The next line tells Stata to write all output to a file. And then we read the already-cleaned data.
Now let’s define the sets of variables we’re going to use in the analysis. Assume you want to vary the set of covariates and see how your estimates change, and you want to see whether the estimator you choose has a substantive effect on your results. You want to produce a nice looking table of estimates from various specifications. For this example, suppose you want to compare results when you do and do not control for husband’s characteristics, and you want to compare results from OLS and median regressions (implemented with the command qreg) of the same specifications.
*** Define sets of variables ;
local demographic "city age educ" ;
local husbandchars "hushrs husage" ;
local allcovars "`demographic' `husbandchars'" ;
local estimators "regress qreg" ;
We've stuffed strings into four new local macros. demographic and husbandchars contain sets of right-hand-side covariates we wish to include or exclude in various models, and the two sets are collected in another macro, allcovars. If you want to add a new variable to one of these sets, it will automatically be included in every subsequent estimation or other command which references that set, so you never have to go through your code clumsily adding or removing variables. Further, your code will be much more readable: imagine a real project with dozens of variables instead of a handful. Finally, the local estimators lists the estimation commands we want to try. Want to try all your specifications with some other estimator too? Just add it to this list.
Now is a good time to generate your descriptive statistics (and remember: all good papers display descriptive statistics) and graphs. We’ll just ask for plain summary statistics for this example. If you are exporting your output to LaTeX, you can use the command sutex instead of summarize and you’ll get the output in the form of .tex code.
*** Summary statistics ;
summ wage `allcovars' ;
And finally we’ll estimate some models. To make the log file more readable we’ll ask Stata to suppress output while running models by wrapping everything using the quietly command. We’ll loop over each of the estimators we specified above, and save all of the results.
*** Estimate regression models. ;
quietly { ;
foreach estimator of local estimators { ;
    `estimator' lwage `demographic' ;
    estimates store `estimator'm1 ;
    `estimator' lwage `demographic' `husbandchars' ;
    estimates store `estimator'm2 ;
} ;
} ;
Given what we placed in the macros, after this loop executes we will have four sets of estimates in memory: regressm1, regressm2, qregm1, and qregm2. If we want one table to display all these results, we can use:
esttab * , b(%8.3f) t(%7.2f) stats(N r2)
    ti("Log wage OLS and median regression estimates")
    booktabs ;
This command tells Stata to make a table of all (*) the estimates it has saved. Since we're economists, we want coefficients (to three decimal places) and t-ratios (to two decimal places) rather than standard errors. We tell Stata to report the number of observations used and the R-squared from the model if it's available. Finally, the option booktabs tells Stata to write the results as .tex code, although there are other options which make it easy to export the output to Word, to .html, or to several other formats.
Running this code produces a table which looks like this (with ten seconds of editing in .tex to add the “OLS” and “median” labels):