Joint with Leticia Abad
The 2020 election was strange, and not just because the President denied he had lost, was believed by millions, and inspired a violent plot to use a mob to seize control of Congress and overturn the election.
No, it would have been strange even without all that. Georgia and Arizona went blue. Florida went red. Donald Trump picked up nonwhite votes — open bigotry notwithstanding — and upscale suburbanites voted for what may be the most left-wing Democratic platform since 1976. (Or maybe 1948.)
So what happened? Well, we have been gathering county-level electoral, Covid, and BLM-related data for another project. Numbers in hand, we decided to take a look at the 2020 election.
The first thing we did was to look at some simple correlations. What county-level characteristics were related to a bigger swing towards Donald Trump between 2016 and 2020?
Figure 1 Simple correlations with the 2020 swing towards the GOP, by county
The above figure shows our results. Negative numbers represent swings towards the Democratic candidate. Dark shading means that the correlation is tight and significant at the 95 percent level; moderate shading means it’s significant at the 90 percent level, and very light shading means that it is not significant. The ethnic markers refer to self-identified ethnicity and not foreign origin; in other words, in all cases they represent mostly citizens (save for Central Americans).
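For readers who want to try this at home, here is a minimal sketch of how one bar in a figure like this could be computed: a Pearson correlation between a county characteristic and the swing, plus a rough significance check. The function names, the toy numbers, and the normal approximation to the t statistic are all assumptions made for illustration; this is not the actual data or code behind Figure 1.

```python
import math
from statistics import NormalDist, mean


def correlation_with_pvalue(x, y):
    """Pearson correlation of x and y plus an approximate two-sided p-value.

    The p-value uses a normal approximation to the t statistic, which is
    reasonable with thousands of counties but rough for small samples.
    """
    n = len(x)
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    r = sxy / math.sqrt(sxx * syy)
    t = r * math.sqrt((n - 2) / (1 - r * r))
    p = 2 * (1 - NormalDist().cdf(abs(t)))
    return r, p


def shading(p):
    """Map a p-value to the three shading levels described for the figure."""
    if p < 0.05:
        return "dark"
    if p < 0.10:
        return "moderate"
    return "very light"


# Hypothetical toy data: share of college graduates and GOP swing (points).
college_share = [0.15, 0.22, 0.30, 0.41, 0.18, 0.35, 0.27, 0.50]
gop_swing = [1.9, 1.2, -0.4, -2.1, 1.5, -1.0, 0.3, -2.8]

r, p = correlation_with_pvalue(college_share, gop_swing)
print(f"r = {r:.2f}, p = {p:.3f}, shading = {shading(p)}")
```

A negative r here would read the same way as in the figure: more of the characteristic goes with a swing towards the Democratic candidate.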
These results, of course, don’t mean much. Several of those variables are highly correlated with each other. For example, did Mexican-American counties swing towards Trump because they are Mexican-American or because Mexican-Americans are relatively young? Moreover, they don’t allow for geographic variation. Did Puerto Ricans move away from Trump because they are Puerto Rican or because lots of them live in New York and New Jersey? So all the above numbers should be taken with a grain of salt, both the ones that jibe with the conventional wisdom and the ones that don’t.
To start the long process of teasing out which variables mattered, we ran a simple OLS “horserace” regression. And here is where things get interesting!
Figure 2
Let’s take it from the top. The above figure shows the coefficient (i.e., the correlation controlling for other things) for various variables. It also shows the error bars. If the error bar crosses the zero line, then there is a good chance that the effect you see is really the product of random chance. Positive numbers mean a swing towards President Trump; negative numbers mean a swing towards Joe Biden relative to Hillary Clinton in 2016.
Conceptually, these regressions compare different counties within a state. That is, swings in one way or another that uniformly affect every county and voting group within a state are already accounted for. (In technobabble, we use state fixed effects and cluster standard errors at the state level.) These numbers capture the average swing for a group across the country relative to the statewide swing where they live.
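To make the technobabble concrete, here is a minimal sketch of what state fixed effects and state-clustered standard errors do, cut down to a single explanatory variable. Everything in it is an assumption for illustration: the function names, the toy numbers, and the bare-bones cluster-robust formula with no small-sample correction. The real regressions include many variables at once.

```python
import math
from collections import defaultdict


def demean_within(values, groups):
    """Subtract each group's mean from its members (the 'within' transform)."""
    totals, counts = defaultdict(float), defaultdict(int)
    for v, g in zip(values, groups):
        totals[g] += v
        counts[g] += 1
    return [v - totals[g] / counts[g] for v, g in zip(values, groups)]


def fe_slope_clustered(x, y, state):
    """Slope of y on x with state fixed effects and state-clustered errors.

    Demeaning x and y within each state gives the same slope as including
    a dummy variable for every state. The standard error uses the basic
    cluster-robust sandwich formula without small-sample corrections.
    """
    xd = demean_within(x, state)
    yd = demean_within(y, state)
    sxx = sum(a * a for a in xd)
    beta = sum(a * b for a, b in zip(xd, yd)) / sxx

    # Sum x * residual within each state, then square the state totals.
    score = defaultdict(float)
    for a, b, g in zip(xd, yd, state):
        score[g] += a * (b - beta * a)
    se = math.sqrt(sum(s * s for s in score.values())) / sxx
    return beta, se


# Hypothetical toy data: the two states differ in their statewide swing and
# in their typical x, but share the same within-state relationship.
state = ["GA", "GA", "GA", "TX", "TX", "TX"]
x = [0.10, 0.20, 0.30, 0.40, 0.50, 0.60]
swing = [-3.0, -2.5, -2.0, 1.0, 1.5, 2.0]

beta, se = fe_slope_clustered(x, swing, state)
print(f"within-state slope = {beta:.2f} (clustered SE = {se:.2f})")
```

In this toy data the made-up Texas counties sit four points to the right of the made-up Georgia ones on average, but that gap is absorbed by the demeaning. The slope only reflects how counties differ from their own state's average.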
So what do we get? First, a pair of mysteries! One, Covid deaths do not seem to have had much effect. Two, BLM protests in a county are associated with a small (but statistically significant) swing towards the Democrats … and if anything, the swing was greater if the protests were violent. Admittedly, this result is neither surprising nor free from endogeneity.* But it is different from what Omar Wasow found when he analyzed the effect of the 1960s-era protests and riots.
The Latino swing towards the GOP shows up fairly clearly. The higher the share of Mexican-Americans, Cubans, or Dominicans in a county, the bigger the swing towards the GOP. Asian-Americans show the same pattern. On the other hand, Blacks and Puerto Ricans swung fairly clearly towards the Democrats: reports from the exit polls of Trumpian inroads among African-Americans do not appear to be borne out in these data.**
Finally, an unexpected result, which you can’t quite see in this data. A lot of ink has been spilled on the rural-urban divide in America. And it’s real! But it’s not really cultural. From our data, large central metro counties swung about three percentage points more Democratic in 2020 than the rest of their state, ceteris paribus. But everyone else — exurban counties, small metro areas, “micropolitan” counties, and rural places — was about the same. That is to say, when you account for demographic differences, there isn’t much of a rural/urban split in this country. There’s just a small giant-center-city/everyone-else split.
But there’s a problem with these results! The regressions treat every county equally. Alpine County, California, (population 1,129) is treated the same as Los Angeles (population 9,969,510). It could very well be that small shifts in tiny counties are driving our results. So we weighted our regressions by population, and report both results below.
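The difference between the two approaches can be sketched with a weighted least-squares slope for a single variable. The function name and the toy counties below are assumptions for illustration only; they are not our data, and the real regressions also include the other controls and the state fixed effects.

```python
def weighted_slope(x, y, w):
    """Slope from a least-squares fit of y on x with observation weights w."""
    total = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / total
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / total
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    return sxy / sxx


# Hypothetical toy data: three tiny counties with a steep relationship and
# two large counties with none.
population = [1_000, 1_200, 900, 2_000_000, 5_000_000]
x = [0.00, 0.10, 0.20, 0.05, 0.15]
swing = [0.0, 2.0, 4.0, 1.0, 1.0]

# Equal weights reproduce the ordinary unweighted regression.
unweighted = weighted_slope(x, swing, [1] * len(x))
weighted = weighted_slope(x, swing, population)
print(f"unweighted slope = {unweighted:.1f}, weighted slope = {weighted:.1f}")
```

With equal weights the three tiny counties dominate and the slope is steep. With population weights the two big counties dominate and the slope is close to zero, which is the kind of reversal to look for when comparing the two sets of results.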
Figure 3
So what changes? First, our coefficients on Covid deaths, the share of college graduates, and BLM protests don’t change. Neither does the African-American swing towards the Democrats.
Second, the effect of unemployment goes away. That was driven by small counties. That effect disappears when you give the big cities their proper weight. (Yes, that is a little surprising.)
Third, the peculiarities of Mexican-Americans, Puerto Ricans, and Dominican-Americans also go away. It turns out that the Mexican and Dominican swings towards the GOP were driven by tiny counties. (We identified a few counties with precisely one Dominican resident.) Same thing with the Puerto Rican swing towards the Democrats. When you give the populous counties their due weight, the effects all disappear. Only the Cuban swing holds up as a real thing that happened.
This isn’t to say that it’s a bad idea for Democrats to try to appeal to the Mexican-American voters in the Rio Grande Valley who broke hard for the Republican incumbent. It’s never a bad idea to try to appeal to any constituency! But they seem to have been a peculiar group and we’re not sure that the putative Latino swing to the GOP is really worth bothering about, since we’re not sure that it happened. (Cubans, that’s a different story. They really did swing towards Trump, but the reasons are not that mysterious. Even if you did not expect the GOP’s demonization of socialism to work as well as it did, and we did not, it should not come as a massive surprise that it did work.)
Finally, it does indeed look as though Asian-Americans in the big cities and suburbs swung back towards the GOP. We can try to break that out by ethnicity, and we will! But that’s the real swing in our data, the one that nobody seems to be talking much about, maybe because it’s more fun to blame things on Mexicans.
To end with a plea: any thoughts on research designs that could tease out the reasons behind the counterintuitive results on Covid deaths and BLM-related unrest?
* Yes, yes, “endogeneity” is technobabble. Sue us.
** A problem with Central Americans is that most are not citizens. If you include a variable for the share of noncitizens in a county, then the coefficient on Central Americans disappears. Since non-citizens do not vote, it is possible then that the effect is actually that citizens in counties with lots of non-citizens tend to be related to those aforementioned non-citizens, and people with non-citizen relatives swung towards the Democrats. But because the share of non-citizens and the share of Central Americans tend to be correlated, it’s hard to disentangle what’s going on.
Fascinating. The thought I had was that those variables were likely to be correlated with the baseline (2016 vote share, yeah)? Maybe 2sls covid-19 deaths, BLM, etc. on baseline vote share. All else equal the absolute change in vote share could be smaller in counties that had very high D or R share in 2016?
Posted by: YFNEconomist | February 03, 2021 at 08:37 AM
Thanks for the encouragement. Endogeneity is the bane of our social science existence, for sure! Your instrument is a good idea! Now, we think we have --at least-- two endogenous regressors: Covid mortality and BLM protests. So we can only play with one variable at a time. And we did! Bad first stage for Covid, good first stage for BLM. What to do? And do you have in mind the mechanisms/rationale? Exclusion restriction holds? Enquiring minds want to know!
Posted by: Leticia Abad | February 03, 2021 at 10:37 PM
Welp, exclusion restrictions probably don't hold, so maybe not the best instrument after all. So maybe use relative change, rather than absolute for the dependent variable?
I also wonder how the fact that you have not a sample, but a population (unless I'm wrong: do you have all the counties?), should affect your interpretation of significance.
Posted by: YFNEconomist | February 04, 2021 at 07:23 AM