2018 Early voting analysis

You know you're in a programming town when this gets run on your local TV station;s website:

Full methodology:

Latitude and longitude coordinates for 2018 early voting locations were obtained from the State Board of Elections and Ethics Enforcement's lookup tool by using a Python script.

Coordinates were not available for 2014 through this tool, so the bulk of these locations were generated using the U.S. Census geocoder tool. Addresses that could not be matched were manually researched and recorded using Google's geocoder tool.

In 30 North Carolina counties, there were no changes in early voting locations between 2014 and 2018, so these counties were omitted from the analysis. This left 580 sites for the two midterm elections. Voters in these counties were also omitted from this analysis, leaving 6,433,969 active and inactive voters, both of which are eligible to cast ballots, according to state elections officials.

While some early voting locations may have been relocated due to the impact of hurricanes Florence or Michael, this analysis considered only the original early voting locations approved by local elections board and the state board.

Latitude and longitude coordinates were then matched to active and inactive registered voters on addresses, city, county and ZIP using MySQL database software. The query failed to match the addresses of 145,645 voters, a 97.7 percent match rate.

We used the free application programming interface (API) from the Open Route Service to generate isochrones – polygons for geographic information systems used to determine driving distances radiating outward from a point source. Isochrones were generated programatically using a Python script.

Open Route Service limits queries through its API to 10 shapefiles at a time. The service also limits total API queries to 2,500 a day.

Due to these limitations, the Python script runs queries for each site four times to produce a geojson feature collection with shapefiles at half-mile intervals from 0.5 to 20, with each polygon describing a driving distance range.

For example, a point that appears in the isochrone with a mile value of 5, but not in an isochrone with a mile value of 4.5, is within 4.5 and 5 miles from the early voting location.

Voter registration data, in CSV format, are loaded into the database, and a separate Python script was used to import the isochrone geojsons using ogr2ogr and its pygdaltools wrapper with a Python script.

SQL queries can then generate mile values for each isochrone intersecting each voter, by county. By deduplicating the table based on the voter and keeping the smallest value, we can find the closest site and distance for each voter in 2014 and 2018.

We then used database software to calculate the change in distance from the closest voting location in 2014 and the closest early voting location in 2018 for every active and inactive voter.

Because the driving distances were limited to 20 miles from each voting location, 62,325 voters could not be matched with either a 2014 or 2018 isochrone because they were outside the 20-mile range. This amounts to less than 1 percent of the registered voters in the study for which the difference in driving distance could not be calculated.


One thought on “2018 Early voting analysis

Comments are closed.