Partition Decoupling Method Reveals Multiple Layers of Legislative Voting Behavior
1. Partition Decoupling
for roll call data
Scott Pauls
Department of Mathematics
Dartmouth College
scott.pauls@dartmouth.edu
5th Annual Conference on Political Networks
June 13-16, 2012
University of Colorado at Boulder
and the University of Denver.
2. Partition Decoupling for roll call
data
This is joint work with Greg Leibon, Dan Rockmore,
and Robert Savell, all from Dartmouth.
http://arxiv.org/abs/1108.2805
3. Inference from roll call data
Aye! Nay!
Aye Aye
4. A legislator is merely a bundle of
votes.
6. Partition Decoupling Method (PDM)
NETWORK
COMMUNITY LEARNING
LOW DIMENSIONAL REPRESENTATION
ITERATION
7. Comparisons
                          Minority  Random   Poole-Rosenthal:  Poole-Rosenthal:  % of residual  PDM:       PDM:        % of residual
                          model     model    1 dim.            2 dim.            captured       one layer  two layer   captured
House APRE                0         0.4561   0.534             0.593             13             0.839      0.856       11
Percent correct (House)   67.3      [72,88]  84.5              86.5              13             94.7       95.3        11
Senate APRE               0         0.4834   0.476             0.563             17             0.809      0.822       7
Percent correct (Senate)  66.6      [70,90]  82.3              85.2              16             93.6       94.1        8
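The slide does not define the "% of residual captured" columns. One reading that reproduces every entry is the share of the remaining gap (to APRE = 1, or to 100% correct) that the second dimension or second layer closes. The short check below is a sketch under that assumption; the helper name `residual_captured` is hypothetical, and the input values are copied from the table.

```python
def residual_captured(before, after, ceiling):
    """Percent of the gap between `before` and `ceiling` closed by moving to `after`."""
    return 100.0 * (after - before) / (ceiling - before)

# (before, after, ceiling, value printed on the slide), copied from the table.
table_rows = [
    (0.534, 0.593, 1.0, 13),    # House APRE, Poole-Rosenthal 1 dim. -> 2 dim.
    (84.5, 86.5, 100.0, 13),    # House percent correct, Poole-Rosenthal
    (0.839, 0.856, 1.0, 11),    # House APRE, PDM one layer -> two layers
    (94.7, 95.3, 100.0, 11),    # House percent correct, PDM
    (0.476, 0.563, 1.0, 17),    # Senate APRE, Poole-Rosenthal
    (82.3, 85.2, 100.0, 16),    # Senate percent correct, Poole-Rosenthal
    (0.809, 0.822, 1.0, 7),     # Senate APRE, PDM
    (93.6, 94.1, 100.0, 8),     # Senate percent correct, PDM
]

computed = [round(residual_captured(b, a, c)) for b, a, c, _ in table_rows]
reported = [r for _, _, _, r in table_rows]
```

Under this reading, `computed` and `reported` agree for all eight entries.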
8. Example: 108th Senate
“Conservative Republicans”: Sessions, Kyl, Cornyn, Santorum, etc.
“Moderate Republicans”: e.g. Snowe, Chafee, Collins, Specter, etc.
Frist, Lott, Brownback, Hagel, etc.
Fitzgerald, Gregg, McCain, Sununu, Warner
“tax cuts”
Zell Miller (D-GA)
“Liberal Democrats”: e.g. Kennedy, Feingold, Boxer, Leahy, Reed, etc.
“Conservative Democrats”: e.g. Pryor, Lincoln, Bayh, Breaux, Landrieu, etc.
9. Example: 88th Senate
Axes: Party, Civil Rights
Outer shape: red=midwest, blue=northeast, green=south, black=southwest, yellow=west
10. Layer two
Regional identification dominates highest
correlations (particularly in recent years).
Clustering on the residual data provides a new
partition of the network which is (often) completely
different from the first layer.
In particular, clusters are not dominated by
party identification.
11. Example: 108th Senate
Three clusters of mixed party.
Four sets of issues distinguish the clusters effectively:
1. Infrastructure: Three amendments (86, 214 and 230) to H.J. Res. 2,
the Appropriations Bill, relating to infrastructure projects.
2. Energy: Seven amendments (515, 843, 844, 851, 853, 856, 884 and
1386) to Senate Bill 14, a bill concerning the energy security of the
United States. One amendment (272) to S. Con. Res. 23, relating to
drilling in the Arctic National Wildlife Refuge.
3. Homeland Security: Two amendments (515 and 3631) pertaining to
Homeland Security.
4. Trade: The passage of the US-Chile Free Trade Agreement.
The first and second clusters are well separated by the
Energy votes; the first and third by Energy and
Infrastructure votes; and the second and third by one
Energy vote together with the Homeland Security and Trade votes.
12. Interaction of the two layers
13. Interaction of the two layers
14. Conclusions
• PDM decomposition reveals multiple layers of structure
associated with roll call voting.
• Taken together, these form a mathematical description of
ideology.
• The coarse version of the first layer is close to the results of
spatial models but even the first layer significantly
outperforms spatial models with respect to standard metrics.
• The use of multiple layers allows us to capture a more
nuanced picture of ideology while still retaining the parsimony
of the NOMINATE-type models.
• Our dimensionality results confirm those of Poole-Rosenthal
while simultaneously incorporating contradicting evidence
(e.g. Heckman-Snyder) – the dimensions appear at different
scales.
15. Distinguishing clusters: 108th
Senate
• Coarse picture: one dimensional ideology
(“liberal/conservative”).
Y N N N N N N
N N Y Y N N N
Y Y Y Y Y Y N
Y/N Y/N Y/N Y/N N/Y N/Y N/Y
Y Y Y Y Y N N
Y Y Y N N Y N
Democrats Republicans
16. Distinguishing clusters: 108th
Senate
• Coarse picture: one dimensional ideology
(“liberal/conservative”).
Y N N N N N N
N N Y Y N N N
Y Y Y Y Y Y N
Y/N Y/N Y/N Y/N N/Y N/Y N/Y
Y Y Y Y Y N N
Y Y Y N N Y N
Democrats Republicans
An amendment to an appropriations bill which would eliminate tax cuts.
17. Distinguishing clusters: 108th
Senate
• Coarse picture: one dimensional ideology
(“liberal/conservative”).
Y N N N N N N
N N Y Y N N N
Y Y Y Y Y Y N
Y/N Y/N Y/N Y/N N/Y N/Y N/Y
Y Y Y Y Y N N
Y Y Y N N Y N
Democrats Republicans
An amendment to repeal authorities and requirements for a base closure
18. Distinguishing clusters: 108th
Senate
• Coarse picture: one dimensional ideology
(“liberal/conservative”).
Y N N N N N N
N N Y Y N N N
Y Y Y Y Y Y N
Y/N Y/N Y/N Y/N N/Y N/Y N/Y
Y Y Y Y Y N N
Y Y Y N N Y N
Democrats Republicans
Three votes:
1. Sense of the Congress re: global AIDS funding
2. Cloture: Safe, Accountable, Flexible and Efficient Transportation Act of 2004
3. Amendment to provide a brownfields demonstration for qualified green/sustainable design projects
19. Distinguishing clusters: 108th
Senate
• Coarse picture: one dimensional ideology
(“liberal/conservative”).
Y N N N N N N
N N Y Y N N N
Y Y Y Y Y Y N
Y/N Y/N Y/N Y/N N/Y N/Y N/Y
Y Y Y Y Y N N
Y Y Y N N Y N
Democrats Republicans
Two votes:
1. Extend Unemployment Benefits
2. Sense of the Senate re: imposition of an excise tax on tobacco lawyer’s fees that exceed $20,000/hr
20. Distinguishing clusters: 108th
Senate
• Coarse picture: one dimensional ideology
(“liberal/conservative”).
Y N N N N N N
N N Y Y N N N
Y Y Y Y Y Y N
Y/N Y/N Y/N Y/N N/Y N/Y N/Y
Y Y Y Y Y N N
Y Y Y N N Y N
Democrats Republicans
Amendment to protect US workers from foreign competition for performance of Federal and State contracts.
21. Distinguishing clusters: 108th
Senate
• Coarse picture: one dimensional ideology
(“liberal/conservative”).
Y N N N N N N
N N Y Y N N N
Y Y Y Y Y Y N
Y/N Y/N Y/N Y/N N/Y N/Y N/Y
Y Y Y Y Y N N
Y Y Y N N Y N
Democrats Republicans
Amendment to vest sole jurisdiction over Federal budget process in the Committee on the Budget
22. I suspect that some of the issues raised by the
referees reflect disciplinary differences. For a paper
like this to be attractive to a political science
audience, it is necessary to motivate the method
theoretically. This can be of the type "this model is
based on these behavioral axioms" (like the
Nominate and Clinton-Jackman-Rivers model) or
"this method allows us to answer an important and
previously unanswered question." Either of these
frames would make the paper of greater interest to
political scientists. If the method doesn't
stand on its own substantively, then it's not
something that will have much traction in our
discipline, nor, in my mind, should it.
Editor's notes
Yes, I have a beard now. No, I am substantially taller than the rest of them. They made me sit.
From a sequence of votes, how much can we determine about the legislators themselves? Can we detect party affiliation? Ideology? Procedure? A standard tool in the political science literature is the family of NOMINATE models developed by Poole and Rosenthal. The basic idea in those models is that a legislator votes by considering the relative positions of themselves and a bill in a Euclidean representation of an issue space. The simplest version is the one dimensional model, where the ideal points of the legislators and the bills provide a simple predictive model for voting. This has spurred something of a cottage industry for estimating ideal points: the NOMINATE scheme uses maximum likelihood estimation on roll call data. There are, of course, other approaches. For example, Heckman-Snyder use a factor model on the same data, finding that more dimensions (~5) are needed to explain the data satisfactorily. Inherent in these methodologies are modeling assumptions, the foremost being the a priori determination of the dimension of the space of ideal points. One of the main motivations for our work is to understand to what extent these assumptions are warranted.
So what is our goal? We wish to provide an unsupervised method for empirically modeling ideology from roll call data. In particular, we wish to provide a method by which we gain estimates on the dimensionality of the data. This allows us to validate the extent to which the NOMINATE assumption of “less than 3” dimensions is appropriate. To this end, we present the data and our encoding of it. A legislator is considered to be a bundle of votes, no more, no less. A vote is -1 (nay), 1 (yea) or 0 (not present/not voting). Our basic modeling unit is the notion of a motivation – a (real valued) vector representing an ideologically coherent position on the votes.
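The encoding described above can be sketched in a few lines. The roll call below is made up for illustration, and the single-character input format (`Y`, `N`, `-`) and the helper name `encode_roll_call` are assumptions of this sketch, not the format of any real roll call source.

```python
import numpy as np

# Encoding from the notes: 1 = yea, -1 = nay, 0 = not present / not voting.
# The characters used as keys are a hypothetical input format.
ENCODING = {"Y": 1, "N": -1, "-": 0}

def encode_roll_call(records):
    """Turn one vote string per legislator into a legislators-by-votes matrix."""
    return np.array([[ENCODING[ch] for ch in row] for row in records], dtype=float)

# Made-up data: four legislators, four roll calls.
records = ["YYNY", "YYN-", "NNYN", "NNYY"]
votes = encode_roll_call(records)
```

Each row of `votes` is then the "bundle of votes" that stands in for one legislator.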
Various comments are in order. First, the motivations need not be orthogonal or independent (although in practice they are often the latter); they can overlap substantially. This is not surprising: different ideological positions may have votes of common salience. Second, the weights are quite important and a priori may appear on multiple different scales. Weights at different scales are precisely the type of issue that may make the NOMINATE paradigm fail, to the extent that it misses smaller scales that are dominated by larger ones. Third, the residual term is included, in part, to capture the presence of noise in the data, things we have no chance of detecting given the limitations of the data. For example, if a legislator votes a particular way due to a single bribe, quid pro quo, log-rolling, etc., this is undetectable from a statistical point of view. Our algorithm aims to discover both the weights and the motivations.
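Reading these comments as saying that each legislator's vote vector is approximated by a weighted sum of motivations plus a residual, the weights can be recovered by ordinary least squares once the motivations are known. The sketch below assumes that reading; the function name `fit_weights` and the toy motivations are hypothetical, and the two motivations are deliberately non-orthogonal to show that overlap is allowed.

```python
import numpy as np

def fit_weights(votes, motivations):
    """Least-squares weights W with votes ~ W @ motivations, plus the residual.

    votes:        (n_legislators, n_votes) array with entries in {-1, 0, 1}
    motivations:  (n_motivations, n_votes) real-valued array
    """
    # Solve motivations.T @ W.T ~ votes.T for all legislators at once.
    weights_t, *_ = np.linalg.lstsq(motivations.T, votes.T, rcond=None)
    weights = weights_t.T
    residual = votes - weights @ motivations
    return weights, residual

# Made-up example: two overlapping motivations on four votes.
motivations = np.array([[1.0, 1.0, -1.0, -1.0],
                        [1.0, 1.0, 1.0, -1.0]])
votes = np.array([[1.0, 1.0, -1.0, -1.0],   # follows motivation 0 exactly
                  [1.0, 1.0, 1.0, -1.0],    # follows motivation 1 exactly
                  [1.0, 1.0, 0.0, -1.0]])   # abstains where the two disagree
weights, residual = fit_weights(votes, motivations)
```

In this toy case the third legislator gets weight 0.5 on each motivation and the residual vanishes; with real data the residual is what the next layer would be built from.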
We use correlation, but other measures (e.g. percentage of votes in common) yield similar results. For our clustering step, we use spectral clustering. Again, other methods (e.g. k-means) yield similar results. We determine the weights via least squares. This first pass gives us our layer one approximation. This is a dimension reduced version of the data dictated by the motivations. By construction, the motivations will pick out only the structure at the dominant scale. Thus, when we create the residual (i.e. compute 𝜖), we see pieces of smaller scales amplified for further study. In principle, we continue until we cannot distinguish our residual data from noise (our model for this is a randomized version of the roll call data). In practice, we almost always stop after two levels to avoid overfitting.
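The steps above (similarity, clustering, least-squares weights, residual) can be sketched as a single pass. This is a simplified illustration, not the method as implemented in the paper: the clustering is reduced to a two-way spectral split using the Fiedler vector, and each motivation is assumed to be the mean vote vector of a cluster, which the notes do not specify. The names `spectral_bipartition` and `pdm_layer` and the toy data are hypothetical.

```python
import numpy as np

def spectral_bipartition(data):
    """Two-way spectral split of the rows of `data` (rows must not be constant)."""
    corr = np.corrcoef(data)                 # row-by-row correlation
    affinity = (corr + 1.0) / 2.0            # map [-1, 1] to [0, 1]
    np.fill_diagonal(affinity, 0.0)
    laplacian = np.diag(affinity.sum(axis=1)) - affinity
    _, eigvecs = np.linalg.eigh(laplacian)   # eigenvalues in ascending order
    # Sign of the second-smallest eigenvector (Fiedler vector) gives the split.
    return (eigvecs[:, 1] > 0).astype(int)

def pdm_layer(data):
    """One layer: cluster, build motivations, fit weights, return the residual."""
    labels = spectral_bipartition(data)
    # Assumption: a motivation is the mean vote vector of its cluster.
    motivations = np.vstack([data[labels == c].mean(axis=0) for c in (0, 1)])
    weights_t, *_ = np.linalg.lstsq(motivations.T, data.T, rcond=None)
    approx = weights_t.T @ motivations
    return labels, approx, data - approx

# Made-up roll call with two opposed blocs of three legislators each.
votes = np.array([[ 1,  1,  1, -1, -1, -1],
                  [ 1,  1,  1, -1, -1,  1],
                  [ 1,  1, -1, -1, -1, -1],
                  [-1, -1, -1,  1,  1,  1],
                  [-1, -1, -1,  1,  1, -1],
                  [-1, -1,  1,  1,  1,  1]], dtype=float)
labels, layer_one, residual = pdm_layer(votes)
```

A second layer would come from calling `pdm_layer(residual)`, stopping once the residual cannot be told apart from a randomized version of the data.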
Aggregate Proportional Reduction in Error (APRE) = (Minority Vote - Predicted Errors)/Minority Vote
Random model APREs: 10 randomizations for each congress; APREs are the mean of those trials. Percent correct is given as a range due to the random nature of the model.
Observations: PDM significantly outperforms NOMINATE. Part of this may be due to dimensionality: typically, for example, layer one has ~5-10 motivations, hence 5-10 “dimensions.” While not directly comparable, this would lead us to compare to a 5-10 dimensional NOMINATE model. While the errors are then more comparable, Poole and Rosenthal indicate that they believe these extra dimensions are just overfitting noise, while the motivations come with ideological descriptors. In other words, the dimensions given by the PDM have derived meaning associated with them and hence can be interpreted, compared to one another, etc. NOMINATE performs just about as well as our random model.
Information difference: for n legislators and k votes, Poole-Rosenthal uses n+k variables for each dimension. The minority model uses k (binary) variables; the random model uses 2*k variables (# yea, # nay for each vote). Ours uses c(n+k), where c is the total number of clusters.
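To make the APRE formula concrete, here is a small sketch on made-up data. It sums minority votes and prediction errors over all roll calls and ignores abstentions, which is an assumption of this sketch. The minority model is read as "predict the majority side on every vote" and the random model as "shuffle each roll call while keeping its yea/nay counts"; both readings, and all function names, are hypothetical.

```python
import numpy as np

def apre(votes, predicted):
    """(sum of minority votes - sum of prediction errors) / sum of minority votes."""
    cast = votes != 0                                   # ignore abstentions
    yeas = (votes == 1).sum(axis=0)
    nays = (votes == -1).sum(axis=0)
    minority = np.minimum(yeas, nays).sum()             # must be nonzero
    errors = ((predicted != votes) & cast).sum()
    return (minority - errors) / minority

def minority_model(votes):
    """Predict that every legislator votes with the majority (ties go to yea)."""
    majority = np.where((votes == 1).sum(axis=0) >= (votes == -1).sum(axis=0), 1, -1)
    return np.tile(majority, (votes.shape[0], 1))

def random_model(votes, rng):
    """Shuffle each roll call among legislators, keeping its yea/nay counts."""
    return np.column_stack([rng.permutation(col) for col in votes.T])

# Made-up roll call: four legislators, three votes.
votes = np.array([[ 1,  1, -1],
                  [ 1, -1, -1],
                  [ 1, -1,  1],
                  [-1, -1,  1]])
baseline = apre(votes, minority_model(votes))
perfect = apre(votes, votes)
shuffled = random_model(votes, np.random.default_rng(0))
```

The minority model scores an APRE of 0 by construction, matching the first column of the comparison table, and a perfect prediction scores 1.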
This example shows the results of the layer one approximation for the 108th Senate. We find six motivations which clearly delineate the two major parties as well as subgroups within them. The only party “cross-over” is Zell Miller (you may remember that, at this time, he endorsed G. W. Bush for reelection over Kerry and spoke at the Republican convention). The embedding given here is a two dimensional spectral embedding, derived from the clustering process. In short, it is a reasonably good approximation of the layer one data. This embedding roughly reflects a one dimensional ideological projection similar to (and correlated with) NOMINATE scores. Moreover, the motivations come with annotation derived from their representative votes. Due to time constraints, I won’t discuss this in detail (although I have the slides), but the point is that the different clusters are distinguished by appropriate and valid ideological indicators as represented by votes. For example, the “Liberal Democrats” are separated from all the other clusters by their votes on some amendments to an appropriations bill concerning tax cuts.
A quick example of a higher dimensional representation. Here the PDM recovers a truly two dimensional representation for the 88th Senate which is directly in line with NOMINATE. The two axes, as derived from the motivations, are basically indicators of party ID and opinion on a collection of bills related to Civil Rights.
These graphs give dimension estimates from the spectral embeddings (technically, they are dimension estimates using MDS with the traditional cutoff of stress < 0.1). The blue bars are the dimension estimates for the first layer. The red bars are the estimates for the second layer. The black line gives the estimates for the combination of the two layers. Observations: The estimates on the first layer confirm the results of Poole-Rosenthal using NOMINATE. One or two dimensions are sufficient for most congresses. The estimates for the second layer show the amount of information being lost. This is consistent with Heckman-Snyder, who uncovered, using factor analysis, the necessity of more dimensions. The combination (black) shows how these disparate views may be unified: the secondary dimensions are small in scale when compared to the first.
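The notes give the stress cutoff (0.1) but not which MDS variant or stress formula was used. The sketch below assumes classical (Torgerson) MDS and Kruskal's stress-1, and returns the smallest embedding dimension whose stress falls under the cutoff. All function names are hypothetical, and the test points are made up.

```python
import numpy as np

def classical_mds(dist, dim):
    """Classical MDS: embed points in `dim` dimensions from a distance matrix."""
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    top = np.argsort(eigvals)[::-1][:dim]               # largest eigenvalues first
    scale = np.sqrt(np.clip(eigvals[top], 0.0, None))
    return eigvecs[:, top] * scale

def stress1(dist, coords):
    """Kruskal stress-1 between target distances and embedded distances."""
    diff = coords[:, None, :] - coords[None, :, :]
    embedded = np.sqrt((diff ** 2).sum(axis=-1))
    return np.sqrt(((dist - embedded) ** 2).sum() / (dist ** 2).sum())

def estimate_dimension(dist, cutoff=0.1):
    """Smallest dimension whose classical MDS embedding has stress below cutoff."""
    for dim in range(1, dist.shape[0]):
        if stress1(dist, classical_mds(dist, dim)) < cutoff:
            return dim
    return dist.shape[0] - 1

def pairwise_distances(points):
    diff = points[:, None, :] - points[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

# Made-up configurations: points on a line and points spread over a plane.
line = np.array([[0.0], [1.0], [2.0], [3.0], [5.0]])
plane = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 2.0], [3.0, 2.0], [1.0, 1.0]])
line_dim = estimate_dimension(pairwise_distances(line))
plane_dim = estimate_dimension(pairwise_distances(plane))
```

Applied per congress to the layer one data, the layer two data and their combination, an estimator of this kind would produce the three series shown in the graphs.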