Election 2012: Google’s Insight

Seth Stephens-Davidowitz uses Google Insights for Search to analyze voting patterns and turnout, and to predict the winner.

Seth Stephens-Davidowitz doesn’t need to read polls to know who’s going to prevail in today’s hotly contested presidential election. Just analyze millions of Google searches for terms like “where to vote,” “Obama,” or “Paul Ryan shirtless”—an actual search—he says, and you’ll have your winner.

As a self-described “lost graduate student” who is working on a doctorate in economics, Stephens-Davidowitz has been studying the election through the lens of an online tool called Google Insights for Search, a statistical treasure chest of anonymous, comprehensive data from millions of Google searches in hundreds of media markets in the United States. Such data, he says, could prove to be more reliable than standard prediction polls and surveys.

“It was just like the clouds parted and I realized it was the coolest thing I have ever seen in my life—I’ve spent every day for the last year and a half analyzing Google searches,” he explains. “It’s a very powerful tool to get beyond this idea of ‘social desirability bias,’” in which “people tend to tell surveys what they think makes them look good instead of what they actually believe.”

Harvard Magazine talked to Stephens-Davidowitz this morning for more insights into today’s election.

HM: According to your latest review of Google Insights, who is going to win today?

SSD: My methodology is a little unconventional and new, but I’m seeing a Gary Johnson victory [laughs]. Kidding aside, I don’t think that there is anything in the data that is telling me anything different from what others are saying—that Obama should win. There are a couple of states [where] I think Obama might do a little bit worse than what the polls are saying, such as Colorado and Pennsylvania, where Democratic turnout might be lower than what the polls are saying, based on my search analysis.

HM: How do you come to that conclusion? How many millions of people or searches are in your study?

SSD: What Google data is really useful for is to help predict turnout. People exaggerate their tendency to vote, but by looking at searches such as “where to vote,” “how to vote,” “voting locations,” “voting 2012”—and by seeing whether those searches are high or low compared to the same period four years earlier—you can get a prediction on whether turnout is going to be high or low. In a couple of states where turnouts are expected to be low and the searches are particularly low in Democratic strongholds, that could be a bad sign for Obama. And there are tens of millions of people doing these types of searches. It is a very common search prior to the election.
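The comparison Stephens-Davidowitz describes can be sketched in a few lines of Python. This is a minimal illustration, not his actual method: the function name `relative_search_change` is hypothetical, the index values are invented, and it assumes search interest for each term is available as a single number for the same pre-election window in both years.

```python
from statistics import mean

def relative_search_change(earlier, later):
    """Return the fractional change in average search interest across terms.

    `earlier` and `later` map each search term to a search-interest index
    for the same pre-election window in two election years.
    """
    shared = [term for term in earlier if term in later]
    if not shared:
        raise ValueError("no search terms in common")
    before = mean(earlier[term] for term in shared)
    after = mean(later[term] for term in shared)
    if before == 0:
        raise ValueError("earlier search interest is zero")
    return (after - before) / before

# Hypothetical index values for one media market (invented, not real data).
# The terms themselves are the ones quoted in the interview.
interest_2008 = {"where to vote": 80, "how to vote": 60, "voting locations": 40}
interest_2012 = {"where to vote": 72, "how to vote": 54, "voting locations": 36}

change = relative_search_change(interest_2008, interest_2012)
print(f"Change in voting-related search interest: {change:+.1%}")
```

A negative value in a party's stronghold would be the kind of low-turnout signal he mentions; how strongly it maps to actual turnout is exactly what he says still needs checking against vote totals.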

HM: According to the data, what are the issues voters care most about? 

SSD: One of the things you see in the data is perhaps a little [less] focus on the issues and a little [more] focus on superficial facts than voters might tell pollsters. The issues people care about are very state-specific: some states are very concerned with Romney’s abortion position; other states are more [focused on] the economy or gas prices or things like that. In general, people search a lot less on the issues than they let on to pollsters. It’s amazing how often people search for things like “Paul Ryan shirtless.” The recent positive comments Governor Christie [made] about Obama have gotten a lot of attention; many people falsely think he endorsed Obama. I think Romney is probably upset about that whole situation!

HM: You said race mattered in 2008. Has the data shown the same this year? Will it hurt Obama?

SSD: It definitely will hurt Obama, although it could hurt a little less than four years ago. Racist searches have been down overall a little bit. In 2008 we saw a big surge of “Obama Muslim” searches right before the election and we haven’t seen that much this time around, so it might be a little bit less of an issue this time. I think a lot of people these last couple of days have been looking up a lot of stuff on Mormons, so that could be an interesting last-minute factor. Many people don’t know that Romney is Mormon and they don’t know what Mormon positions are, so you do see in many evangelical areas people questioning what Mormons believe and whether they should support a Mormon. The theological factor could definitely be something that plays a role today.

HM: What does Google Insights tell you about voter turnout today? Will it be greater or less than polls have predicted? 

SSD: It’s really a state-by-state issue. In some states it could be a lot higher—Mormon turnout is going to be a lot higher. I don’t think polls realize how energized the Mormon base is for Romney, but that is really going to help him most in non-swing states like Utah and Idaho, and maybe half a percentage point in Colorado and a tiny bit in Nevada.

HM: What are you going to be doing today? Will you be analyzing data?

SSD: I will, I always do. I thought that by now I would have thought of one search or combination of searches that would really be the predictor of how many votes Obama or Romney is going to get in each state, and I’m not confident enough that I have found one, so I am just going to focus on voter turnout. The good thing is that the more elections you have, the more you can compare the searches with what actually happened. The disadvantage is, we only have two elections where we have Google data; it’s very hard to know what actually has predictive power and what is just a fluke. After the vote totals come in, we can look back on some of these things and get a lot more information. There are many subtle things to look for—I’ve noticed if you just compare the ratio of searches where Obama comes before Romney, compared to Romney coming before Obama, that seems to have a lot of predictive power over whether a state is going to go to Obama or Romney. Things like that, though, I really need to confirm after the vote total comes in.  
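The name-order ratio he mentions could be computed roughly as follows. This is only a sketch under stated assumptions: `name_order_ratio` is a hypothetical helper, the query counts are invented, and it assumes per-query counts are available, which the interview does not confirm.

```python
def name_order_ratio(query_counts, first="obama", second="romney"):
    """Ratio of searches naming `first` before `second` to the reverse order.

    `query_counts` maps a search query to the number of times it was issued.
    Queries that do not mention both names are ignored.
    """
    first_leads = 0
    second_leads = 0
    for query, count in query_counts.items():
        text = query.lower()
        i, j = text.find(first), text.find(second)
        if i == -1 or j == -1:
            continue  # only queries naming both candidates count
        if i < j:
            first_leads += count
        else:
            second_leads += count
    if second_leads == 0:
        raise ValueError("no queries with the second name first")
    return first_leads / second_leads

# Hypothetical counts for one state (invented, not real data).
sample_queries = {
    "obama romney polls": 300,
    "romney obama debate": 200,
    "obama vs romney": 100,
    "where to vote": 500,
}

ratio = name_order_ratio(sample_queries)
print(f"Obama-first to Romney-first ratio: {ratio:.2f}")
```

A ratio above 1 means more searches put Obama first; whether that predicts a state's outcome is, as he says, something to confirm once the vote totals are in.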

HM: How will people be using this tool in 2016?

SSD: It’s going to be huge; it’s only going to get bigger and more important… If anything, the percentage of people using Google is going to get higher and it will make the data that much more powerful. Surveys are getting worse and worse just as Google data are getting better and better. There will be a point where the information in Google is more powerful than what a survey can tell us, and that is coming soon. It will be a huge part of political and social analysis of all types.

HM: Is it just Google Insights? What about Twitter or Facebook? Is social media something to be looking at in the future? 

SSD: [Twitter and Facebook] don’t have the anonymity; that is really Google’s advantage. I can imagine in four years the best predictor could be a combination of Facebook likes, Tweets, YouTube views, and Google data—you can put everything in there. But the fact that Google is anonymous makes people much more honest—and that is a huge advantage. 
