
It’s Time for a US National Election Results Archive

The MIT Election Data and Science Lab helps highlight new research and interesting ideas in election science, and is a proud co-sponsor of the Election Sciences, Reform, & Administration Conference (ESRA).

Our post today was written by Stephanie Singer, based on her presentation at the 2021 ESRA Conference. The information and opinions expressed in this column represent her own research, and do not necessarily represent the opinions of the MIT Election Lab or MIT.


Pity the researcher trying to study United States election results. Paper after paper in this field makes clear that a significant part of the research effort is simply assembling the relevant election results. This painstaking and error-prone task is hardly the highest and best use of the researcher’s time.

Anyone can go to elections.ca for detailed results data for Canadian federal elections in a single consistent format. No such official resource exists for American federal election results data. Consolidation of United States election data is hampered by the freedom that jurisdictions have to publish in whatever format they choose.

The United States federal government regularly funds consolidation of some election-related data, including the Election Administration and Voting Survey (which includes registration and turnout statistics, but not candidate vote totals) and the American National Election Study which, despite its charge to “[provide] gold standard data on voting, public opinion, and political participation in American national elections,” is limited to public opinion surveys.

In the absence of government action, the American entrepreneurial spirit has given us some partial archives assembled by academics (e.g., the MEDSL collections or the United States Elections Project) or archives behind paywalls (such as at CQPress). But with advances in data science and a growing acceptance within government walls of the importance of publishing data, a much better archive of United States election results is within our reach.

No archive is necessary to achieve the primary purpose of election results: to determine the winners, it’s enough that each state, territory and district publish the results. But election results are essential to understanding elections. Political operatives, political scientists and election administrators need good data to analyze elections, plan future campaigns and continuously improve operations. 

What characteristics would a great American election results archive have? Data could:

  • be openly available at no cost.
  • come from primary sources.
  • be available in a variety of consistent, useful formats.
  • include all US jurisdictions, including the District of Columbia and the five territories.
  • include all elections and all contests, including federal, state and local.
  • be available at all geographic granularities, including precinct-level, state-level and various levels in between, such as county, state house district or township.
  • be available by voting method.
  • be confirmed with rigorous data quality testing.
  • be available in a timely manner—preliminary results shortly after they are published, and official results shortly after they are certified.

Software will be a key component of building and maintaining an American election results archive, because it reduces the amount of work done by hand. With funding from the National Science Foundation, we have built software supporting timely, less-labor-intensive consolidation of election results. The software does not entirely eliminate work by hand (some copying and pasting from intransigent formats like PDF is still required), but results available in tabular text formats, Excel, XML or JSON can be automatically consolidated into a single database. A human eye is also required to specify the organization of the data (are the candidate names in the rows, or in the column headers?), but this specification can be done in advance of the election, making the post-election processing automatic.
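To make the idea of an advance specification concrete, here is a minimal sketch in Python. It is not the software described above: the function name consolidate, the specification keys and the sample files are all hypothetical, invented for illustration. It assumes each source is a small CSV file and that the only layout question is whether candidates appear in rows or in column headers.

```python
import csv
import io

# Hypothetical specifications, written before election day, one per source layout.
ROWS_SPEC = {"candidates_in": "rows", "county_col": "county",
             "candidate_col": "candidate", "votes_col": "votes"}
COLUMNS_SPEC = {"candidates_in": "columns", "county_col": "county"}


def consolidate(text, spec):
    """Turn one jurisdiction's CSV text into (county, candidate, votes) records."""
    records = []
    for row in csv.DictReader(io.StringIO(text)):
        county = row[spec["county_col"]]
        if spec["candidates_in"] == "rows":
            # One candidate per row: read the name and count from named columns.
            records.append((county, row[spec["candidate_col"]],
                            int(row[spec["votes_col"]])))
        elif spec["candidates_in"] == "columns":
            # One candidate per column header: every non-county column is a candidate.
            for header, value in row.items():
                if header != spec["county_col"]:
                    records.append((county, header, int(value)))
        else:
            raise ValueError("candidates_in must be 'rows' or 'columns'")
    return records


# Two made-up files reporting the same contest in different layouts.
rows_file = "county,candidate,votes\nAdams,Smith,100\nAdams,Jones,80\n"
columns_file = "county,Smith,Jones\nAdams,100,80\n"

# Both layouts end up in the same consistent form.
print(consolidate(rows_file, ROWS_SPEC) == consolidate(columns_file, COLUMNS_SPEC))
```

Because the specification is plain data, it can be prepared and checked against a past election's file long before the new results are published.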

In addition, the software provides tools for analysis, visualization, and export:

  • analyses at a national scale, such as finding one-county outliers in multi-county contests
  • visualizations such as scatter plots comparing vote counts by candidate, vote type or party to each other or to external data such as census data
  • exports in tabular, XML or JSON formats, including the National Institute of Standards and Technology (NIST) common data format for election results reporting.
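
As an illustration of what a one-county outlier search might look like, here is a rough sketch in Python. The method shown (a modified z-score based on the median absolute deviation of a candidate's vote share across counties) is an assumption chosen for simplicity, not necessarily what the software does. The function names, the threshold of 3.5 and the sample counts are all made up.

```python
from statistics import median


def candidate_shares(records, candidate):
    """Compute one candidate's share of the vote in each county."""
    totals, won = {}, {}
    for county, name, votes in records:
        totals[county] = totals.get(county, 0) + votes
        if name == candidate:
            won[county] = won.get(county, 0) + votes
    return {c: won.get(c, 0) / t for c, t in totals.items() if t > 0}


def flag_outliers(shares, threshold=3.5):
    """Return counties whose share is far from the median of all counties."""
    values = list(shares.values())
    med = median(values)
    mad = median(abs(v - med) for v in values)  # median absolute deviation
    if mad == 0:
        return []  # no spread to measure against
    return sorted(c for c, v in shares.items()
                  if 0.6745 * abs(v - med) / mad > threshold)


# Made-up results for a five-county contest.
sample = [
    ("Adams", "Smith", 52), ("Adams", "Jones", 48),
    ("Baker", "Smith", 49), ("Baker", "Jones", 51),
    ("Clark", "Smith", 51), ("Clark", "Jones", 49),
    ("Dixon", "Smith", 50), ("Dixon", "Jones", 50),
    ("Evans", "Smith", 12), ("Evans", "Jones", 88),
]

print(flag_outliers(candidate_shares(sample, "Smith")))
```

A flagged county is only a prompt for a closer look. An unusual share may reflect a data-entry error, or it may simply reflect a county that votes differently from its neighbors.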

The software has already supported:

The software adheres to the best practices promoted by the Linux Foundation Core Infrastructure Initiative, including versioning. There is a detailed Users Guide.

And the software has an open source license so that anyone may use or improve it.

We now have several building blocks for a high-quality, comprehensive, public national archive of election results: widespread publication of results in electronic form, software support for consolidation of those results, and NIST common data formats. Next steps include:

  • working with election agencies and election system vendors to encourage publication in standard formats.
  • finding sustainable funding.

If we can find a way to make such an archive a reality, analysts will have more time for analysis, and research analyses will be easy to repeat, election after election, or to apply retroactively. And the more data we consolidate, the more sophisticated algorithms we can apply. Who knows what we might discover?

Stephanie Singer combines the scientific method with extensive practical election experience to implement and advocate for high-impact changes in election practices. For more on her professional background in election administration, academia and data science, see www.campaignscientific.com.
