US State Legislature Compositions

Published: 2016-09-05

By: MJ Rossetti

Repository

Project Page

Background

The National Conference of State Legislatures (NCSL) publishes information on its website about the gender and party compositions of state legislatures from 2009 to 2016.

The source data was published in PDF and HTML formats, so it needed additional processing to become machine-readable.

Data Engineering

Seeking machine-readable versions of the NCSL data, I found an open source GitHub repository, but it hadn’t been updated in a few years. Happy to contribute to an open source project, I sought to update the repository with the most recent years’ data.

Through a Twitter conversation with the repository owner, I learned the existing conversion process involved software called Tabula. When I used Tabula to convert the most recent two years of party composition data, I ran into issues.

In response to these PDF conversion issues, the owner suggested the pdftotext command-line utility, which ultimately produced adequate TXT files.
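A conversion like this can be driven from a short script. The following is a minimal sketch, not the repository's actual script: the `pdfs/` directory name and the `pdftotext_command` helper are assumptions made for illustration, and the `-layout` flag is one reasonable choice for tabular pages rather than a confirmed setting.

```ruby
# Build the argument list for pdftotext. The -layout flag asks the tool to
# preserve the physical layout of the page, which helps with tabular data.
# (Hypothetical helper, not taken from the original repository.)
def pdftotext_command(pdf_path)
  txt_path = pdf_path.sub(/\.pdf\z/i, ".txt")
  ["pdftotext", "-layout", pdf_path, txt_path]
end

# "pdfs" is an assumed directory name; adjust it to wherever the PDFs live.
Dir.glob("pdfs/*.pdf").sort.each do |pdf_path|
  command = pdftotext_command(pdf_path)
  ok = system(*command) # arguments are passed directly, without a shell
  warn "conversion failed: #{pdf_path}" unless ok
end
```

Passing the arguments as separate strings avoids shell quoting problems with file names that contain spaces.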

After writing a Ruby script that calls the pdftotext utility to convert the PDF files to TXT format, I wrote scripts to convert the TXT files to CSV and JSON formats. I then wrote scripts to convert the gender composition HTML tables to CSV and JSON formats.
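The TXT-to-CSV and TXT-to-JSON step might look something like the sketch below. It assumes the text output keeps columns separated by two or more spaces; the sample text, column names, and figures are placeholders invented for illustration, not NCSL data, and the helper names are hypothetical.

```ruby
require "csv"
require "json"

# Placeholder text resembling column-aligned output. The real NCSL layout
# and figures may differ; these values are illustrative only.
SAMPLE_TEXT = <<~TXT
  State      Total Seats   Dem   Rep
  State A    100           40    60
  State B    50            30    20
TXT

# Split each non-blank line on runs of two or more spaces, so that
# multi-word values such as "State A" stay in one column.
# All values are kept as strings.
def parse_rows(text)
  lines = text.lines.map(&:strip).reject(&:empty?)
  header = lines.first.split(/\s{2,}/)
  lines.drop(1).map { |line| header.zip(line.split(/\s{2,}/)).to_h }
end

# Serialize the parsed rows as CSV, with a header row taken from the keys.
def rows_to_csv(rows)
  CSV.generate do |csv|
    csv << rows.first.keys
    rows.each { |row| csv << row.values }
  end
end

# Serialize the parsed rows as an indented JSON array of objects.
def rows_to_json(rows)
  JSON.pretty_generate(rows)
end

rows = parse_rows(SAMPLE_TEXT)
puts rows_to_csv(rows)
puts rows_to_json(rows)
```

Parsing into an array of hashes first means both output formats share one intermediate representation, so a parsing fix applies to CSV and JSON alike.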

After initial satisfaction with the conversion results, I introduced validation checks into the process. These validations uncovered a few errors in the source data, which I communicated to NCSL via Twitter and remediated by updating the conversion scripts.
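The post does not spell out which checks were used. One plausible check, sketched here as an assumption, is that the party seat counts for each legislature add up to its total seat count. The field names and figures below are hypothetical and are not NCSL data.

```ruby
# Hypothetical rows; the second one is deliberately inconsistent.
ROWS = [
  { "state" => "State A", "total_seats" => 100, "dem" => 40, "rep" => 58, "other" => 2 },
  { "state" => "State B", "total_seats" => 50, "dem" => 30, "rep" => 21, "other" => 0 }
].freeze

# Assumed field names for the per-party seat counts.
PARTY_FIELDS = %w[dem rep other].freeze

# Return one message per row whose party counts do not sum to the total.
def validation_errors(rows)
  rows.each_with_object([]) do |row, errors|
    sum = PARTY_FIELDS.sum { |field| row.fetch(field) }
    next if sum == row.fetch("total_seats")

    errors << "#{row['state']}: party seats sum to #{sum}, expected #{row['total_seats']}"
  end
end

validation_errors(ROWS).each { |message| warn message }
```

Collecting every failure, rather than stopping at the first, makes it easier to separate conversion bugs from errors already present in the source.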

When satisfied with the validation effort, I submitted a pull request to add the most recent data and the automated conversion scripts to the original repository.

Software Engineering

After producing a full complement of machine-readable data, I created an interactive dashboard to consume the data and aid in exploration.

Data Analysis

The sections below contain findings from my analysis of 2016 NCSL Legislature Composition data.
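Findings of this kind come down to computing each legislature's share of seats for a given group and ranking the results. The sketch below illustrates that calculation; the records, field names, and figures are hypothetical placeholders, not the 2016 NCSL data.

```ruby
# Hypothetical records; field names and figures are illustrative only.
LEGISLATURES = [
  { "state" => "State A", "total_seats" => 200, "women" => 84 },
  { "state" => "State B", "total_seats" => 150, "women" => 21 },
  { "state" => "State C", "total_seats" => 100, "women" => 25 }
].freeze

# Percentage of seats held by the given group, rounded to one decimal place.
def share(record, field)
  (100.0 * record.fetch(field) / record.fetch("total_seats")).round(1)
end

# Records ordered from the highest share to the lowest.
def ranked_by_share(records, field)
  records.sort_by { |record| -share(record, field) }
end

ranked_by_share(LEGISLATURES, "women").each do |record|
  puts format("%-8s %5.1f%%", record["state"], share(record, "women"))
end
```

Ranking by share rather than by raw seat count keeps very large legislatures from dominating the comparison.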

Legislature Sizes

Despite the state’s small size, New Hampshire’s legislature is the largest (424 seats), while the 13-seat DC City Council is the smallest.

a greyscale choropleth map

Legislature Gender Compositions

The Colorado and Vermont legislatures have the highest proportions of women (each over 40%).

a pink-scale choropleth map

The legislatures of Wyoming, Oklahoma, South Carolina, West Virginia, Alabama, and Mississippi have the highest proportions of men (each over 85%).

a blue-green-scale choropleth map

Legislature Partisan Compositions

Nebraska’s Legislature is nonpartisan.

The legislatures with the greatest concentration of Democratic Party members are those of Hawaii, the District of Columbia, and Rhode Island.

a blue-scale choropleth map

The state legislatures with the greatest concentration of Republican Party members are Wyoming, Utah, and South Dakota.

a red-scale choropleth map