Rob Kailey is a working schmuck with no ties or affiliations to any governmental or political organizations, save those of sympathy.
|
|
Sat Jun 19, 2010 at 12:59:01 PM MST
|
| I realize that I'm a little bit late to the party now, but I was really busy in the last few days- I'm working with two other guys on an algorithm to rate pollsters and accurately project races based on polls- that will most likely be prominently featured at Pollster.com and rival Nate Silver's ratings over at 538.com.
Anyway, today I read the last couple of posts over at Flathead Memo, and saw the graph posted by James Conner under the headline 'Where Gopher won big, Gernant lost big'.
But is that true? Well, obviously it is, but it doesn't seem to be a huge surprise. When four candidates compete for 100% of the vote, it's no surprise that one candidate does badly where another candidate does well. The bigger question is, was Gernant disproportionately hurt in the counties where Melinda Gopher did well, or could James just as well have written a post titled 'Where Gopher won big, Gernant, McDonald and Rankin lost big'?
I'm not really interested in the question of whether Gopher hurt Rankin, but we can look at her influence on the Gernant/McDonald race by introducing the new metric 'Gernant TwoWay vote', which eliminates the Gopher and Rankin votes and is defined as Gernant%*100/(Gernant%+McDonald%).
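For anyone who wants to recompute the metric, here is a minimal Python sketch of the definition above. The function name and the example county are made up for illustration; only the formula itself comes from this post.

```python
def two_way_share(gernant_pct, mcdonald_pct):
    """Gernant's share of the combined Gernant + McDonald vote, in percent.

    Implements Gernant%*100/(Gernant%+McDonald%); votes for Gopher and
    Rankin simply never enter the calculation.
    """
    total = gernant_pct + mcdonald_pct
    if total == 0:
        raise ValueError("no Gernant or McDonald votes to compare")
    return gernant_pct * 100 / total


# Hypothetical county (not real data): Gernant 24%, McDonald 36%,
# the remaining 40% split between Gopher and Rankin.
print(two_way_share(24, 36))
```

Because the other two candidates drop out of the denominator, the metric only moves when the balance between Gernant and McDonald moves, which is what makes it usable as the dependent variable in the regressions.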
If we build a scatterplot for Melinda% and Gernant TwoWay vote, it looks like this:
There doesn't seem to be a particularly good relationship, and indeed, if we regress Gernant TwoWay on Gopher's vote share (and a constant), the coefficient is insignificant (p-value .29).
That means that Gernant didn't lose particularly badly compared to McDonald where Gopher did well.
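A regression like this is easy to reproduce. What follows is only a sketch in plain Python, not the exact procedure or software behind the numbers in this post: it fits a line through (third-candidate share, two-way share) pairs and returns the slope together with its t statistic. The five data points are invented, and the function name is hypothetical.

```python
import math


def simple_ols(x, y):
    """Fit y = a + b*x by least squares; return (a, b, t_stat_of_b)."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least 3 paired observations")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x has no variation")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx                      # slope
    a = mean_y - b * mean_x            # constant
    # Residual variance uses n - 2 degrees of freedom (two fitted parameters).
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    se_b = math.sqrt(sse / (n - 2) / sxx)
    # A perfect fit has zero standard error, so the t statistic is undefined.
    t_stat = b / se_b if se_b > 0 else float("nan")
    return a, b, t_stat


# Invented numbers, for illustration only (NOT the Montana county data).
gopher_pct = [10, 15, 20, 25, 30]
gernant_two_way = [44, 41, 45, 40, 43]
a, b, t = simple_ols(gopher_pct, gernant_two_way)
print(f"constant={a:.2f} slope={b:.3f} t={t:.2f}")
```

Turning the t statistic into a p-value needs the t distribution with n - 2 degrees of freedom, which the Python standard library does not provide; a statistics package will report it directly.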
We can perform the same analysis for Sam Rankin of course, here's the scatterplot:
And here there IS some weak relation between the variables, with Gernant doing slightly worse where Rankin was doing well (upper left corner).
It's not too strong either though: the p-value is .11, which means that if there were no real relationship, there would still be an 11% chance of seeing one at least this strong. Most statisticians consider 5% good enough to draw some conclusions; 11% is not. The coefficient of Rankin% is -.41, which basically means that for each percentage point Rankin got, Tyler lost about .4 points to Dennis McDonald in the two-way vote.
Some more analysis suggests why that may have been the case: Dennis McDonald was doing very badly in places where a lot of people identify their ancestry as 'American'. Gernant was doing okay with them, and Rankin performed very well there. That suggests that if Rankin had not been in the race, those people would probably have gone for Gernant over McDonald. BTW, 'American' (or, as we could also dub it, white trash) is the only variable of even some predictive value for the vote share of Sam Rankin: he did well with them, McDonald in turn did not. But apart from that I'm at a loss as to why Rankin did as well as he did, and that almost everywhere, from Missoula over Big Horn to Garfield.
'American' identifiers, for what it's worth, tend to be disproportionately Republican, White and without health insurance.
They just aren't a huge enough voting group in Democratic primaries (in Montana at least) to swing the election alone though, which means that Gernant would most likely have lost in any case- with Gopher and Rankin in the race or without them. He might have picked up more of their votes than McDonald would have done, but not by such a prohibitive margin that he could have closed in- a two-way election might have resulted in a 56-44 McDonald win.
Anyway, we're not only interested in the influence of Gopher and Rankin on the race, we're also interested in which demographic groups Gernant and McDonald had their strengths in.
|
| Twohundertseventy :: MT Dem Primary Analysis Post #2- The influence of Gopher and Rankin and everything else. |
For that I worked out two regression models.
The first one is pretty easy to read, and it already does a pretty good job of explaining everything.
The basic formula is
TylerTwoWay = 2.75 + .9*Kerry - .7*PartyID - 1.3*BaseDem1 + 17.5*BaseDem2.
Basically we can split this in two parts:
First, good performance by Kerry worked in Tyler's favor, while positive PartyID (=Democratic) worked against him.
What do we make of this? Well, when we look at the sum of those variables, then this is really about liberal/conservative counties. In conservative counties, Kerry underperformed party identification (mind you, except for Kerry's election results those are all estimates), in liberal counties, he overperformed them. There are other factors as well, but basically this means that Tyler did well in liberal, densely populated areas and not so well.. everywhere else. Native American population also shows up as a negative factor in the difference between those two variables (Kerry+PartyID).
The other two variables show a geographical divide: Tyler overperformed by 17.5 points in Missoula (BaseDem2: Missoula County has the value 1, all other counties 0. That's called a 'dummy variable'.), and he did worse when a county was far away from Missoula (BaseDem1: distance from Missoula, primitively measured with an online ruler and an online map).
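To make the first model concrete, here is a small Python sketch that just plugs numbers into the formula. The coefficients are the ones given above; everything else is an assumption on my part: the function and parameter names are hypothetical, the distance term in the formula is taken to be BaseDem1, and the units of Kerry, PartyID and the distance have to match whatever the model was estimated with, which this post does not spell out.

```python
def predict_tyler_two_way(kerry, party_id, distance_from_missoula, is_missoula):
    """Apply the first model's published coefficients to one county.

    ASSUMPTIONS: input units are those of the original estimation (not
    stated in the post); distance_from_missoula plays the role of BaseDem1
    and is_missoula the role of the BaseDem2 dummy variable.
    """
    base_dem2 = 1 if is_missoula else 0   # dummy: 1 for Missoula County, else 0
    return (2.75
            + 0.9 * kerry
            - 0.7 * party_id
            - 1.3 * distance_from_missoula
            + 17.5 * base_dem2)


# Two hypothetical counties that differ only in the Missoula dummy:
with_dummy = predict_tyler_two_way(50, 10, 0, True)
without_dummy = predict_tyler_two_way(50, 10, 0, False)
print(with_dummy - without_dummy)   # the dummy's contribution to the prediction
```

The difference between the two calls is the 17.5-point Missoula bonus, which is all a dummy variable does: it shifts the prediction for the counties where it equals 1 and leaves the others alone.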
This model is doing pretty well. I left out the first 20 counties when I was estimating the model parameters, so that I could check how well the model predicts counties it has never seen, and it holds up on those as well.
We can also look at another model that is a lot better in statistical terms, but also harder to interpret. First, it uses more variables; second, it's heteroskedasticity-corrected, which means that it allows for the fact that the errors can have different variances at different points, so that things like this aren't a problem:
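The hold-out check can be sketched roughly like this in Python. It is an illustration of the idea, not the code used for this post: `holdout_errors`, `mean_abs_error` and the deliberately dumb `fit_mean` baseline are hypothetical names, and the rows are placeholder numbers rather than county data.

```python
def holdout_errors(rows, fit, n_holdout=20):
    """Estimate on everything after the first n_holdout rows, test on those rows.

    rows: list of (features, actual) pairs in a fixed order.
    fit:  function taking training rows and returning predict(features).
    Returns the list of (actual - predicted) errors on the held-out rows.
    """
    if not 0 < n_holdout < len(rows):
        raise ValueError("n_holdout must leave rows for both parts")
    held_out, train = rows[:n_holdout], rows[n_holdout:]
    predict = fit(train)
    return [actual - predict(features) for features, actual in held_out]


def mean_abs_error(errors):
    """Average size of the misses, ignoring their sign."""
    return sum(abs(e) for e in errors) / len(errors)


def fit_mean(train_rows):
    """Stand-in model (an assumption): always predict the training average."""
    mean = sum(actual for _, actual in train_rows) / len(train_rows)
    return lambda features: mean


# Placeholder rows, NOT real counties: 30 rows with empty feature tuples.
rows = [((), float(i)) for i in range(30)]
print(mean_abs_error(holdout_errors(rows, fit_mean)))
```

Swapping `fit_mean` for a real regression fit gives an honest error estimate, because the 20 held-out counties never influence the coefficients they are scored against.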
The formula for the second model is:
8272 + 103*Kerry + 30*Black - 22*Latino - .059*PCI + .058*Median Household Income + 47.5*Evangelical + .85*LDS - 157.7*Manufacturing (jobs in the manufacturing sector, %) + 84.55*Obama - 5.9*Natives + 29*Clinton_Primary - 102.8*PartyID - 80*Independents - .75*RankinVote.
Make of this whatever you want, I think it's too complicated to be of great value for political analysis. Note, however, that it corroborates the other model by also including the Kerry/Obama as positives/PartyID as negative-divide. Also, Rankin shows up as a significant negative factor this time- it's not enough to be significant as a stand-alone variable, but I'm almost certain that Rankin hurt Gernant.
Anyway, I just wanted to throw it out here because it's really good. Here are the forecasts for all counties (which means that it considers the data of all counties up to the one that it's making a forecast for, in alphabetical order):
There are a couple of outliers, for example no model yet was able to explain how Tyler got only 4% of the vote in Meagher County, but overall it does very well.
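The forecasting scheme described above (each county forecast from the counties that come before it alphabetically) can be sketched in Python as follows. This is my reading of it, with assumptions: the county being forecast is excluded from its own estimation data, `min_train` is a hypothetical parameter for how many counties are needed before the first forecast, and `fit_mean` is a stand-in for the real model.

```python
def expanding_forecasts(rows, fit, min_train=5):
    """Forecast each row from a model fitted on all earlier rows only.

    rows: list of (features, actual) pairs, already sorted (e.g. by county name).
    fit:  function taking training rows and returning predict(features).
    Returns (forecast, actual) pairs for rows min_train, min_train + 1, ...
    """
    if min_train < 1:
        raise ValueError("need at least one training row")
    results = []
    for i in range(min_train, len(rows)):
        predict = fit(rows[:i])          # rows[i] itself is never in the training set
        features, actual = rows[i]
        results.append((predict(features), actual))
    return results


def fit_mean(train_rows):
    """Stand-in model (an assumption): always predict the training average."""
    mean = sum(actual for _, actual in train_rows) / len(train_rows)
    return lambda features: mean


# Placeholder rows, NOT real counties.
rows = [((), float(i)) for i in range(8)]
for forecast, actual in expanding_forecasts(rows, fit_mean, min_train=2):
    print(f"forecast={forecast:.2f} actual={actual:.2f}")
```

One side effect of this design is that the early forecasts rest on very few counties, so a miss near the top of the alphabet says less about the model than a miss near the bottom.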
But as far as I'm concerned, that's how I'd analyze the election. In summary: Melinda Gopher and Sam Rankin did pretty well across the board, suggesting that they were more a protest choice than anything else. Tyler Gernant was hurt by early voting and to some lesser extent by Sam Rankin, but given that he only carried Missoula and Flathead Counties, it's hard to imagine how he could have won under any scenario. Dennis McDonald won across the board, obviously, and had a Labor/Conservative voter coalition that helped him expand his margin of victory. Still, at 42% of the four-way vote he has to do a lot of uniting before he can concentrate on taking on Rehberg. |