The two predictions differed only in the inclusion of a pollster quality index.
|Without quality index|
|With quality index|
Obviously, this is more a stroke of luck than anything. Basically, those models managed to capture both the downward trend of the last few days and the last poll numbers. It's good to know that polls are informative, though (not only in levels, but also in trends).
Here is a quick and incomplete list of things to do for the next elections to improve the use of polling data:
- Change the weights: I added the function that (I think) Nate Silver is using for the poll's closeness to election day (exponential decay with a half-life of 30 days), but I didn't know what to do with the poll size and with the very rough indicator of pollster quality.
- On the latter, we need to have a better measure of pollster quality. The main problem is that I don't think we have polls region by region, so the data to compute this quality measure will be quite weak.
- Adding the approval rate of the incumbent would probably be a good thing.
- I used the simplest thing I could find in R, but there must be better ways to do the estimation given the nature of the data (e.g. tuning the span of the non-linear regression, I guess?).
- Other stuff, but you know, I am watching TV
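For the record, the recency weighting mentioned in the first item can be sketched in a few lines. This is only a minimal illustration in Python (not the R code I actually ran), assuming the weight is a plain exponential decay with a 30-day half-life; the function names `recency_weight` and `weighted_poll_average` and the poll numbers are made up for the example, and poll size and pollster quality are deliberately left out.

```python
def recency_weight(days_before_election, half_life_days=30.0):
    """Exponential decay: a poll taken `half_life_days` before the
    election counts half as much as one taken on election day."""
    if days_before_election < 0:
        raise ValueError("days_before_election must be non-negative")
    return 0.5 ** (days_before_election / half_life_days)


def weighted_poll_average(polls, half_life_days=30.0):
    """Recency-weighted mean of poll shares.

    `polls` is a list of (days_before_election, share) pairs.
    Poll size and pollster quality are ignored in this sketch.
    """
    weights = [recency_weight(days, half_life_days) for days, _ in polls]
    weighted_sum = sum(w * share for w, (_, share) in zip(weights, polls))
    return weighted_sum / sum(weights)


# Hypothetical polls: (days before election, vote share in percent)
example_polls = [(60, 30.0), (30, 28.0), (0, 26.0)]
average = weighted_poll_average(example_polls)
```

With these made-up polls the weights are 0.25, 0.5 and 1, so the most recent poll dominates and the average is pulled towards 26 rather than sitting at the unweighted mean of 28. A shorter half-life would make the estimate follow the last few polls even more closely.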