A non-technical guide to the dueling Seattle minimum wage studies

Over the past couple of weeks two studies of the effects of Seattle’s recent minimum wage hikes have been released: one from some researchers at the University of Washington, the other from researchers at Berkeley.  The results have widely been interpreted, including by the authors of one of the studies, as wildly inconsistent.  This post presents a non-technical guide to the problems associated with attempting to estimate the effects of the minimum wage, the statistical methods used in both papers, and how to interpret the results (spoiler: the results are actually not in conflict).

The scientific problem.

Consider an analogy.  Suppose we wanted to figure out whether some new pharmaceutical is effective at treating some health condition.  What we would do is take a large group of people, randomize some to get the new drug, and compare the outcomes of the people who randomly get the drug to those of people who randomly don’t get the drug.  In this randomized controlled trial, we would like to use as large a sample as possible so as to average out the unfathomably complex causes of any given person’s health, and we randomize who gets the drug to rule out the possibility that the underlying determinants of who gets the drug also determine the health outcomes.  For example, if we let people choose whether to take the drug and then compare the outcomes of those who choose to take it to those who don’t, it would misleadingly look like the drug caused health to fall if only desperate people in very low health chose to take it.

Now apply this reasoning to a policy change such as hiking the minimum wage.  We’d like to evaluate such a policy by taking a very large group of cities, randomizing some cities to get higher minimum wages and others not, and then comparing the outcomes of interest, such as wages and employment, across those two groups.  We’d get an estimate of the average effect of the minimum wage hike which is, on average, correct.

Unfortunately, no such large-scale controlled experiment is available.  And unfortunately, if we ask a very specific question like ‘what was the effect of the 2015/16 increases in minimum wages on employment in Seattle?’, we’re not asking how minimum wages affect the average city, we’re asking about this particular case.  This is analogous to asking not ‘does this drug benefit the average patient?’, acknowledging some patients may be helped more than others and some may actually be harmed.  It’s like asking ‘how did this drug affect Fred Jones, specifically?’  The sample size is, in a sense, n=1.  Further, we face the problem that Seattle wasn’t randomly chosen as a city to increase minimum wages, so it’s as if Fred Jones, standing in a large group of people, was the only person to stick his hand up when the research team asked for volunteers to test some new drug, and we want to know the effect of that drug on Fred Jones.

There is no statistical magic which can fully overcome these fundamental problems.  We will never be able to “prove” what the effect of the minimum wage was: that’s not the way statistics work in general, and in a case study like ‘what was the effect of the 2015 increase in minimum wages on employment in Seattle?’ the best we can hope for is to bring some suggestive evidence to the table.

How did the researchers attempt to estimate the effect of Seattle’s minimum wage, given these issues?  It may at first glance seem easy: we just look to see whether employment in Seattle went up or down after the minimum wage increased.  This intuition is what many skeptics have in mind when they critique the UW study’s finding of negative effects on employment: employment in Seattle rose during 2015 and 2016, so the UW study must be wrong, they claim.

The problem with this argument is that employment rises or falls for many reasons other than the level of the minimum wage.  Suppose for the sake of argument that Seattle actually created, say, 10,000 net low-wage jobs following the increase in the minimum wage.  The causal effect of the minimum wage on employment is not, then, +10,000, it’s 10,000 minus the number of jobs which would have been created if the minimum wage were not in place.  If 15,000 would have been created without the minimum wage hike and 10,000 were created with the hike, then the minimum wage hike actually destroyed 5,000 jobs, even though 10,000 jobs were created after the policy went into place.
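The arithmetic here is simple but worth making explicit: the causal effect is the observed outcome minus the unobservable counterfactual outcome.  A few lines of Python, using the made-up numbers from the paragraph above (not real Seattle data):

```python
# Causal effect = observed outcome minus counterfactual outcome.
# These numbers are the hypothetical ones from the text, not real data.
jobs_created_with_hike = 10_000     # net low-wage jobs observed after the hike
jobs_created_without_hike = 15_000  # what would have happened absent the hike (unobservable)

causal_effect = jobs_created_with_hike - jobs_created_without_hike
print(causal_effect)  # -5000: 5,000 jobs destroyed even though employment rose
```

The whole empirical challenge is that the second number can never be observed directly; it must be estimated.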

To illustrate this point with the Seattle data, consider this graph:

In the absence of the minimum wage hike, the graph above might have looked more or less the same but with a slightly steeper increase in employment starting in 2015.  This possibility is illustrated on the graph by “counterfactual Seattle.”  The red line shows actual employment.  The blue line shows what might have happened if Seattle hadn’t increased the minimum wage (entirely made up, assuming a causal effect of the minimum wage of zero in the month implemented, rising linearly in magnitude to negative two percent two years later).  If we knew, somehow, that the real world was just like the graph, that is, we somehow knew that employment would have followed the blue path if counterfactually Seattle hadn’t hiked its minimum wage, we would conclude that the minimum wage hike decreased employment even though employment in our world actually rose following the hike.

Note this same argument applies when a minimum wage is imposed on an economy going into recession rather than booming, such as the 2007 Federal minimum wage hike in the U.S.: decreases in employment following a minimum wage hike tell us exactly nothing about the causal effect of the hike.

How did the research teams estimate the effects of minimum wage changes?

To attempt to overcome this problem, both teams of researchers try to find a “control group” of cities which did not hike their minimum wages.  The jargon “control group” highlights the methodological ties to the pharmaceutical analogy above: this is not a randomized controlled experiment, but we want to make it look as close as possible to a randomized controlled experiment using statistical methods.  Consider this graph of made-up data:

In the graph there are just two times: “before” and “after” the minimum wage hike.  The solid blue line shows what happens in the Seattle we see in the data, the real Seattle which actually hiked minimum wages.  Employment in real Seattle rises from 4 to 5 units.  We shouldn’t conclude that the minimum wage hike increased employment by one unit, however, because we see that in “some other city” (actually, in the real studies: an average across many other cities in which there was no change in the minimum wage) which didn’t hike minimum wages, employment rose from 2 to 4 units.  We might reason: if Seattle had not hiked its minimum wage, its employment would have followed the same pattern as in cities which didn’t implement the hike.  If that were so, Seattle’s employment over time would have looked like the dashed blue line, it would have gone from 4 to 6 units rather than from 4 to 5.  In this case, one estimate of the effect of the minimum wage hike in Seattle is -1: employment rose in Seattle by 1 unit, but we guess it would have risen by 2 units if the minimum wage had not been hiked, so we estimate the minimum wage actually reduced employment by 1 unit despite the fact that employment went up 1 unit after the hike.
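The comparison described above boils down to subtracting the control city’s change from Seattle’s change.  A sketch with the made-up numbers from the graph:

```python
# Difference-in-differences with the made-up numbers from the graph:
# subtract the control city's change from Seattle's change.
seattle_before, seattle_after = 4, 5   # treated city (hiked the minimum wage)
control_before, control_after = 2, 4   # comparison city (no hike)

did_estimate = (seattle_after - seattle_before) - (control_after - control_before)
print(did_estimate)  # -1: employment rose by 1 unit, but we guess it would have risen by 2
```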

A serious issue with this reasoning is that it assumes that the trends in employment in Seattle and the control city would have been the same but for the increase in minimum wages (this is called the “parallel trends” assumption).  But that’s not generally true.  Suppose that employment in Seattle would have flatlined in 2015 and 2016 if not for the minimum wage hike, even as employment increased in comparable other cities.  Then if the actual data looked like either of the graphs above, the truth would be that the increase in minimum wages actually increased employment in Seattle, even though the comparison just described would estimate a negative effect.

Synthetic controls.

Both studies use a variant of the method described above which partially addresses this issue.  Instead of assuming that Seattle would have experienced a trend in employment equal to the trend in control cities if Seattle hadn’t increased its minimum wage, they assume that every city experiences common influences on employment over time, but each city may respond differently to those influences.  Suppose for example that every city’s employment goes up when the national economy booms, but that Seattle’s employment is much more sensitive to changes in the national economy than, say, Yakima’s.  Maybe Seattle’s employment rises 2% when the national employment rate rises 2%, but Yakima’s only goes up 1%.  This difference in sensitivity means these two cities would generally experience different trends in employment over time even if neither changes their minimum wage, and the statistical method described above would yield misleading results.  Both papers use a method called “synthetic controls” which addresses this concern and somewhat weakens the assumption that all cities have the same trend in employment over time, replacing it with the assumption that the sensitivity of each city to common influences doesn’t change over time (the UW team also uses a closely related method called “interactive effects,” and finds very similar results).  The two papers use different sets of control cities: the UW paper uses cities in Washington State, whereas the Berkeley paper uses cities across the U.S.  It’s not obvious which strategy is better, and it is unfortunate that the teams didn’t each report estimates from both strategies in the interest of robustness and comparability.
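To make the idea concrete, here is a toy sketch of a synthetic control with entirely made-up data.  This is not either paper’s actual implementation (real applications match on more variables and use proper optimizers; the crude grid search here is purely for transparency): the point is that we choose non-negative weights, summing to one, on the control cities so that their weighted average tracks Seattle’s pre-hike employment path, and then use those same weights to project a counterfactual post-hike path.

```python
import numpy as np

# Made-up data: 3 control cities observed over 6 pre-hike periods.
rng = np.random.default_rng(0)
pre_controls = rng.normal(100, 5, size=(3, 6))
# Made-up "Seattle": an exact mix of two control cities, so a good fit exists.
seattle_pre = 0.6 * pre_controls[0] + 0.4 * pre_controls[2]

def fit_error(w):
    """Squared distance between Seattle and the weighted control average, pre-hike."""
    return float(np.sum((seattle_pre - w @ pre_controls) ** 2))

# Crude grid search over the 3-city weight simplex (real work uses a solver).
grid = np.arange(0, 1.01, 0.01)
best = min(((w1, w2, 1 - w1 - w2) for w1 in grid for w2 in grid if w1 + w2 <= 1),
           key=lambda w: fit_error(np.array(w)))
weights = np.array(best)

# Project the counterfactual: apply the fitted weights to post-hike control data.
post_controls = rng.normal(105, 5, size=(3, 4))   # 4 post-hike periods
synthetic_seattle_post = weights @ post_controls  # estimated counterfactual path
```

The estimated effect of the policy would then be actual post-hike Seattle minus `synthetic_seattle_post`, period by period.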

At the end of the day, in terms of fundamental assumptions, the synthetic control method is more or less the same as, and suffers from similar drawbacks to, the simpler method (called differences-in-differences) first described.  Basically, both papers contrast changes in employment in Seattle with changes in employment in other cities which didn’t change their minimum wages, and conclude the minimum wage destroyed jobs if Seattle’s employment fell relative to employment in other cities.  Neither paper can differentiate an effect of the minimum wage from Seattle-specific changes in the labor market in 2015 and 2016, and no paper ever will.  Analogously, even if we see the single patient who took the new drug get better much faster than other patients, we can’t differentiate between two competing explanations: the drug is really effective, or something else happened to improve that patient’s health at about the same time he took the drug.

Do the results conflict?

The papers differ in the manner in which they construct their estimates.  The Berkeley team has quarterly data on total number of employees and total payroll and constructs the average weekly wage paid by dividing total payroll by total employees.  This data limitation makes it impossible to differentiate changes in average hours worked per worker from changes in wages per hour, and it means that the Berkeley team cannot observe how the distribution of wages changes (for example, how the proportion of workers making between $15 and $17 per hour changes); they can only work with the average wage.  The Berkeley team, following much of the literature, focuses on restaurant workers since many restaurant workers are paid at or near the minimum wage, so any effects of minimum wages should be easiest to detect in that sector.

The UW team has more detailed data on the wages of individual workers, although for reasons not clear to me they didn’t do much to exploit the fact that (I think) they can observe what happens to an individual worker over time.  They also restrict the sample to firms which do not have multiple sites because they cannot determine from their data whether workers at multi-site firms are subject to the minimum wage hike.  It’s not clear to me how the Berkeley team deals with this issue, as they face the same problem.  From the discussion on page 8, I think the Berkeley team treats all workers at a given firm as being in Seattle and subject to the minimum wage hike if the firm either reports separate employment data for each of its locations (so they can pick out the Seattle locations) or reports that its head office is in Seattle.  This is no less problematic than the UW team’s omission of multi-site firms: both teams get biased results due to this issue, for somewhat different reasons.

The UW team also attempted to isolate workers likely to be affected by the minimum wage hike, but with their more detailed data they were able to go beyond limiting attention to restaurant workers.  They instead limit attention to workers who earn less than $19 per hour.  This creates a possible problem: if the minimum wage hikes turn $13 an hour workers into $20 an hour workers, that change will be coded in the UW team’s data as a job destroyed rather than a good job created!  However, the team shows that there are essentially no effects of the minimum wage change on employment even at wage thresholds substantially lower than $19 an hour, and concludes that this problem is unlikely to generate much bias.  This result can be seen in their Figure 1.

The figure shows the estimated effects on employment of the minimum wage increase to $13 for each wage between $9 and $39.  These are not raw changes, but rather changes in employment in Seattle relative to changes in employment in the control cities, estimated using the synthetic control method described above.  We see big negative effects on the number of jobs below $13 per hour, which just shows the minimum wage is actually effective in the sense that such jobs have been legislated out of existence.  What we hope to see is the number of jobs above $13 per hour increasing in Seattle more than in control cities, for example because the policy turned $11 an hour jobs into $13 an hour jobs, but that’s not what we see.  In particular, it isn’t the case that the UW team has just mistaken the creation of much better (greater than $19 per hour) jobs for destroyed jobs: high-wage jobs increased in Seattle in 2015-2016, but high-wage jobs also increased elsewhere, by about the same amount, in cities in which the minimum wage was not increased.  But more low-wage jobs disappeared in Seattle than in cities in which the minimum wage didn’t increase.  This is the essence of the UW team’s findings.

Comparing the estimates.

The most directly comparable estimates in the two studies are of the effects on restaurant workers, presented in Table 9 of the UW study and Table 2 of the Berkeley study.  Both teams estimate that a 10% increase in the minimum wage caused about a 1% increase in the average wage of restaurant workers (those under $19 per hour in the UW case, and about 2% rather than 1% for the Berkeley team’s estimate for fast-food restaurant workers, although this is a noisy estimate).  This effect is perhaps surprisingly small, and occurs at least in part because most workers earn enough that they are not directly affected by changes in the minimum wage.

The contentious estimates are those on employment.  These estimates have been widely described as in conflict, but they’re actually quite similar, statistically.  The results are not easy to compare directly because the Berkeley team frames their results as answers to the question, “for every 1% the minimum wage increases, by what percent does employment rise or fall?” (that is, they report elasticities of employment with respect to the minimum wage).  The UW team mostly frames results as answers to the question, “by what percent does employment rise or fall in response to the actual minimum wage hikes?”  Further, the UW team addresses restaurant workers in isolation only in Table 9 and focuses on results for all workers, whereas the Berkeley team only estimates effects for restaurant workers.

The UW team’s estimates of the effects of the minimum wage hikes on all restaurant employment are reported in the second-to-last column of Table 9, labeled “All wage levels, Jobs.”  Across time, these estimates range from +3.6% to -1.3%, but in all cases the estimated effect is small relative to the uncertainty of the estimate, that is, none of these estimates are even close to being statistically different from zero.  We can express these estimates as responses to each 10% increase in the minimum wage by dividing by 1.6 for the first three quarters after enforcement of the first minimum wage hike (since it was a 16% hike) and by 3.7 for the next three quarters (since it was a 37% hike; see the note for Table 8).  After these adjustments, the UW team’s estimates appear even smaller in magnitude: ballpark them at about zero.
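The rescaling works like this (the pairing of example effects with hikes below is illustrative, not taken from Table 9; the two example values are simply the endpoints of the range quoted above):

```python
# Convert a raw percent employment effect into the effect per 10% minimum wage
# increase: divide by the hike size expressed in units of 10 percentage points.
def per_10pct(effect_pct: float, hike_pct: float) -> float:
    return effect_pct / (hike_pct / 10)

print(per_10pct(3.6, 16))   # 2.25  (a +3.6% effect after a 16% hike)
print(per_10pct(-1.3, 37))  # about -0.35  (a -1.3% effect after a 37% hike)
```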

The Berkeley team’s analogous estimates are reported in the bottom panel of their Table 2.  They find similarly small effects: each 10% increase in the minimum wage changes employment by about -0.6% to +0.6%.  These estimates are also quite noisy and consistent with moderate positive or negative effects.  Without the data and a great deal of work, we can’t formally test whether these estimates are consistent with each other, but given how small, similar, and noisy the estimates are, it is very implausible that such a test would find them statistically distinguishable.

What about the estimates that are somewhat less comparable across the two studies?  Are they in conflict, to the extent that we can compare them?  The UW team detects a statistically significant effect on low-wage restaurant workers, who may be a similar group to the Berkeley team’s fast-food restaurant workers.  The UW team estimates that the minimum wage hikes reduced low-wage restaurant workers’ employment by up to 13% (Table 9, column 5, 2016 quarter 2).  That is equivalent to estimating that each 10% increase in the minimum wage decreased employment by 3.5%.  These estimates are statistically significantly different from zero.  The Berkeley team, conversely, estimates that each 10% increase in the minimum wage decreases fast food employment by only 0.6% (six-tenths of one percent), and this estimate is not statistically significantly different from zero.  Aren’t these estimates in conflict?

No.  To see why, consider this graph showing the two teams’ estimates and the associated confidence intervals:

The points in the middle of the lines show the teams’ best guesses as to what happens to the employment of fast food or low-wage restaurant workers for each 10% increase in the minimum wage.  The lines represent 95% confidence intervals around these guesses.  One interpretation of these intervals is that they contain all of the effect sizes which are statistically indistinguishable from the best guess, so for example, the UW team guesses that employment falls by 3.5% when the minimum wage rises 10%, but they would not say that decreases between about 1% and 6% are statistically different from that guess.  Notice the Berkeley team’s confidence interval heavily overlaps the UW team’s.  That does not mean that we can conclude the estimates are statistically indistinguishable (technically, we might still reject the null that the two estimates are the same even if the confidence intervals overlap, particularly if the estimates are positively correlated).  But, even though we can’t actually conduct the test, it seems very unlikely we’d be able to distinguish between the two estimates.  We can also say that neither team would be surprised should a deity reveal that the true effect is actually negative and moderately small, say around -2%.
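As a rough illustration of what the formal test would look like (treating the estimates as independent and normal): compute the difference between the two point estimates and divide by the combined standard error.  The standard errors below are assumptions made for this sketch, not numbers from either paper; the UW one is backed out from the rough confidence interval quoted above, and the Berkeley one is a guess.

```python
from math import erf, sqrt

# Hypothetical inputs: point estimates (% employment change per 10% MW increase)
# and ASSUMED standard errors, for illustration only.
uw_est, uw_se = -3.5, 2.5 / 1.96          # SE backed out from a CI half-width of ~2.5
berkeley_est, berkeley_se = -0.6, 1.5     # SE is a pure guess

# Two-sample z-test for the difference between independent estimates.
z = (uw_est - berkeley_est) / sqrt(uw_se**2 + berkeley_se**2)
p_two_sided = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))  # normal two-sided p-value
print(round(p_two_sided, 2))  # well above 0.05: can't distinguish the two estimates
```

Under these assumed standard errors the difference is nowhere near statistical significance, which is the intuition the confidence-interval graph conveys.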

By analogy, suppose the two teams were trying to determine if a coin is fair.  The Berkeley team flips the coin 100 times and gets 54 heads.  Their best guess is that the coin is slightly more likely to come up heads than tails, but if the coin were actually fair, they’d frequently get at least 54 heads or 54 tails (about 48% of the time), so they conclude that 54 heads is not statistically significantly different from 50 heads; there is “no effect.”  The UW team flips the coin 200 times, so they have somewhat more information, and gets 115 heads.  115 heads, it turns out, is statistically different from 100 heads, so the UW team reports that they’ve “found an effect.”  But neither would report that their estimates are statistically different from a probability of heads of, say, 55%, and the two teams’ results are not actually in conflict.
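The coin-flip probabilities can be computed exactly with a few lines of Python (a two-sided exact binomial test against a fair coin; this is my own check, not from either paper):

```python
from math import comb

def two_sided_p(n: int, k: int) -> float:
    """Probability a fair coin gives a result at least as lopsided as k heads in n flips."""
    extreme = abs(k - n / 2)
    return sum(comb(n, i) for i in range(n + 1)
               if abs(i - n / 2) >= extreme) / 2 ** n

print(round(two_sided_p(100, 54), 2))  # 0.48: 54 heads in 100 flips is unremarkable
```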

In other words, what the Berkeley team means when they report “no effect” on employment is not that there is no effect on employment (yes, that is confusing).  What they mean, again, is that there is no statistically significant effect on employment, whereas the UW team, using different data and somewhat different statistical methods, finds a statistically significant effect.  But the difference between statistically significant and statistically insignificant is often itself not statistically significant.  These estimates are both consistent with small negative effects on employment in the restaurant sector.  The UW team also reports estimates on other sectors and finds larger negative effects on employment, but there are no analogous estimates in the Berkeley study.

What does all this mean?

Two teams of researchers have presented estimates of the effects of Seattle’s recent minimum wage increases on restaurant workers in Seattle, using similar methods and similar data.  Both teams find that there were small but detectable increases in average wages paid in the restaurant sector.  One team found there were no statistically significant effects on employment, but that result should not be misunderstood as a claim that the study “proves” the effect was actually zero, and the estimates in the two studies are not statistically in conflict in the sense that they are both consistent with small to moderate negative employment effects in the restaurant sector.  We can’t compare results in other sectors because the Berkeley study limits attention to restaurant workers.

1. Regarding your two data-related questions:

First, about the longitudinal aspect of the data used by UW folks: I think people who haven’t worked with UI wage data underestimate how noisy it can be, especially where interstate migration is significant. This problem is even worse for younger workers and low-wage workers.

Second, I believe you are incorrect that the Berkeley folks face the same problem with multisite establishments. They use QCEW data, which allocates the employment and wage numbers to counties based on the Multiple Worksite Report.

1. Thanks for the comments. On the QCEW data: I’ve never worked with it, I’m basing that remark on what the Berkeley paper says about data limitations. Here’s the relevant passage:

“Moreover, some multi-site businesses report payroll and head counts separately for each of their locations, while others consolidate their data and provide information as if their business operated only at a single location. Moreover, the Bureau of Labor Statistics recently began to organize data spatially by geocodes (exact addresses), rather than by zip codes. Postal zip codes do not exactly match city boundaries. In some cities these changes affected both how multi-unit businesses report their results and whether some businesses were located in the city. Our tests find that the statistical noise level in the city-level Seattle QCEW data was very low.”