Tuesday, December 6, 2011

Race and Pardons: Some of Our Concerns

By now, the well-known headline is that race matters when it comes to applying for a presidential pardon. Or, at least it did, in the administration of George W. Bush! The proposition emerges from a statistical analysis of 494 pardon applications which were acted upon (granted or denied) by Bush. The research was done by ProPublica and readers can read ProPublica's summary of the data here. Some have expressed "surprise" at the findings, although it is our sense that racial disparities are all too commonplace in research on the criminal justice system. In some contexts, they have almost reached the level of "assumption" (e.g. disparities in sentencing, death penalty).

Nonetheless, we feel comfortable accepting the findings of this self-proclaimed "first systematic analysis of pardons" (all apologies to Posner and Landes, Humbert, Whitford and Ochs, Erler and the rest) with great caution, at least until some additional information is known. Indeed, the analysis raises several obvious questions (see ProPublica's abbreviated discussion of the data here) which cannot be ignored for long.

For starters, this is clearly an exercise in barefoot empiricism. If you are looking for theoretical insight, look elsewhere. Which is fine, in some circumstances, to a degree. But we are not so certain that the literature on criminal processes and executive clemency (yes, there really is one!) is so sparse, and immature, as to justify too much unguided exploration via flashlight. As demonstrated below, the model pays little or no heed to what has already been discovered / is known regarding the pardon process. And the result is a non-parsimonious statistical model that features almost a dozen variables that are not statistically significant, at traditional levels.

Despite the discussion in numerous articles covering this study, the odds ratios for the bankruptcy and marriage variables in the multivariate analysis are actually insignificant, while the odds ratio for probation/prison is significant. The data on the age of applicants (although apparently collected by researchers) were not actually used in the analysis (see "UPDATE" below). But, while one is searching through social background characteristics, why not control for education levels? Indeed, clemency petitions frequently emphasize the attainment of this or that degree post-conviction. It is certainly possible that such data were not available, but this omission should not go unnoticed.

As the basic assumptions of linear regression are not applicable in logistic regression, the display of a pseudo R-square in the multivariate analysis (while fun) is not, without more, particularly enlightening. In addition, given that the data in both the sample and the population are heavily skewed (91 percent of the applications being denied), a more interesting (pertinent) thing to know would be the proportional reduction in error. PRE statistics tell us how much the model improves our ability to classify outcomes (denials or grants) over mere guessing. And, again, if we guess every application will be denied, we will be right 91 percent of the time. So - at a minimum - the provision of a classification table (standard output in logistic analysis) would be helpful (see "UPDATE" section below).
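To make the point concrete, here is a minimal sketch of a classification table and a PRE calculation. The outcome coding (0 = denied, 1 = granted), the function names, and the 100 made-up applications are all assumptions for illustration; none of it is ProPublica's data or output.

```python
def classification_table(actual, predicted):
    """Count (actual, predicted) pairs for outcomes coded 0 = denied, 1 = granted."""
    table = {(a, p): 0 for a in (0, 1) for p in (0, 1)}
    for a, p in zip(actual, predicted):
        table[(a, p)] += 1
    return table


def proportional_reduction_in_error(actual, predicted):
    """PRE = (baseline errors - model errors) / baseline errors.

    Baseline errors are the mistakes made by always guessing the modal
    outcome (here, "denied").
    """
    n = len(actual)
    modal_count = max(actual.count(0), actual.count(1))
    baseline_errors = n - modal_count
    if baseline_errors == 0:
        raise ValueError("no variation in outcomes; PRE is undefined")
    model_errors = sum(1 for a, p in zip(actual, predicted) if a != p)
    return (baseline_errors - model_errors) / baseline_errors


# Hypothetical data: 100 applications, 91 denied and 9 granted.
actual = [0] * 91 + [1] * 9
# Hypothetical model: 2 denials wrongly predicted as grants, 6 grants missed.
predicted = [0] * 89 + [1] * 2 + [0] * 6 + [1] * 3

table = classification_table(actual, predicted)
pre = proportional_reduction_in_error(actual, predicted)
print(table)
print(round(pre, 3))
```

In this invented case, guessing "denied" every time makes 9 errors and the model makes 8, so the PRE is only 1/9, or about 11 percent. A model can look respectable on other summary statistics and still improve very little on the 91 percent baseline.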

Given all of what we now know about the timing of pardons (which is far from accidental), it seems peculiar that the analysis does not control for when applications were filed in the Department of Justice, or when they were presented to the White House. There are no controls for the applications filed or presented in the fourth and final year of the term, the time when we know most presidents have granted the largest number of pardons. A counter variable could have easily tested the relationship between grants and length of term. One out of every two pardons for the last 39 years has been granted in the month of December. The published work of Posner and Landes gives attention to these concerns, yet this analysis appears blind on this point.
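As a rough sketch of what such timing controls might look like, the snippet below derives a month counter, a final-year-of-term indicator, and a December indicator from a single date. The function and field names are hypothetical, and it assumes each application carries a usable filing or presentation date, which may or may not be true of the ProPublica data.

```python
from datetime import date

# Start of George W. Bush's first term; used here as the clock's zero point.
TERM_START = date(2001, 1, 20)


def months_since_term_start(event_date, term_start=TERM_START):
    """Whole months elapsed between the start of the term and the event date."""
    if event_date < term_start:
        raise ValueError("event date precedes the start of the term")
    months = (event_date.year - term_start.year) * 12
    months += event_date.month - term_start.month
    if event_date.day < term_start.day:
        months -= 1  # the current month is not yet complete
    return months


def timing_controls(event_date, term_start=TERM_START):
    """Build hypothetical timing variables for one application."""
    months = months_since_term_start(event_date, term_start)
    return {
        "months_into_term": months,  # the "counter" variable
        # Months 36-47 of each four-year (48-month) term are its final year.
        "final_year_of_term": int(months % 48 >= 36),
        "december": int(event_date.month == 12),
    }


print(timing_controls(date(2004, 12, 15)))
```

Variables like these could then be entered alongside race and the other covariates, so that any effect of timing is at least estimated rather than left in the error term.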

On a related point, the U.S. Pardon Attorney was, somewhat dramatically, replaced in the Bush administration but there doesn't seem to have been any interest in testing the potential impact ("shock") of the administrative disruption caused by that event in the model.

One aspect of the data that has received particular attention in the press is "congressional support." But - despite the vast literature on party capability in the legal process - the analysis does not appear to take into account legal representation (or lack of representation) in the application process (a factor which might very well be correlated with the race of the applicant).

Even then, as we understand it, "congressional support" is operationalized in the study as 1) forwarding an application or 2) supporting it. But these strike us as two very different things. Furthermore, having looked over some of the correspondence at ProPublica, we are not convinced that there is a clear line between the two. Consequently, we wonder what percentage of the almost 200 instances of "congressional support" were categorized by ProPublica as mere forwarding exercises, and what percentage clearly involved something more like traditional advocacy (information not reported by ProPublica)? See commentary in "UPDATE" below. And, of course, we wonder what were the differences in outcomes for each category, if any (again, not reported)? Our concern, of course, is that advocacy (in the normal sense of the language) did not have any particular impact and, as a result, its impact - at least arguably - is now being much overstated by the media. In addition, we wonder if contacting a member of Congress might not actually be a surrogate measure for several other factors (education, race, quality of the application, strength of the argument in the application, etc.) which might have an impact on the outcome of applications.

ProPublica reports that it could determine the race of only 54 (27 percent) of the 200 applicants with "congressional support." And yet it has the audacity to report "Congressional influence did not account for the racial disparity" in outcomes. It would seem this is a conclusion as of yet untested.

It appears the data in this analysis do not control for whether or not the member of Congress is in the House or the Senate, or even the party identification of each member (how counter-intuitive!). As we understand it, it is ProPublica's position that, since applications are sent to the OPA, officials in the Department of Justice and the Executive Office of the White House are simply unaware of this kind of information. First, it should be noted that ProPublica explicitly rejects the same logic with respect to race (even the White House claims it is unaware of the race of applicants). Second, by this theory, George Bush would have been completely oblivious to the fact that a Republican Senator from Texas was strongly in support of an application. By this theory, it is also a complete coincidence that 8 of the top 10 supposedly clemency-supporting congressmen during the Bush administration were Republican. We respectfully disagree with this line of thought. Similarly, it is well known that President Bush granted the largest number of pardons to applicants from Texas (currently, Illinois leads with President Obama). But such considerations appear to have been ignored in this analysis as well. Of course, one also wonders about a possible interaction effect between such variables. In sum, we wonder what the impact of race would be (if any) in a multivariate model that is a little more smartly specified.

For the most part, these nuts and bolts questions leave aside numerous theoretical concerns which, we suspect, will be attended to, once the initial rush of sensational headlines passes. There might, for example, be a very significant self-selection process in operation when it comes to contacting members of Congress. Consequently, the "congressional support" variable may very well say more about the kinds of people who contact members of Congress (as well as the overall quality of their applications) than it does about the impact of the mere use of Congressional stationery. Similarly, before we assert a direct relationship between race and outcomes, it seems practical to also explore (or at least think about) indirect effects. In due course.

UPDATE: 12/9/2011. This afternoon two writers for the ProPublica piece were kind enough to call me to provide me their responses to some of the concerns expressed above. They were also open to fielding some questions. As I understand it, they are not willing - at least currently - to release even a classification results table (in order to assess goodness of fit, or the proportional reduction in error). Nor are they willing to share - at least currently - a correlation matrix for the final model. While this would all be somewhat peculiar behavior for an author in my own discipline (political science), I understand that the world of journalism is (and is certainly allowed to be) very different. On the other hand, there is no small irony to such hopefully unnecessary stone-walling (see second paragraph here).

I should also note that neither of the persons that I spoke with explicitly claimed to be responsible for running the statistical analysis, or claimed to have any degree of expertise with logistic regression. ProPublica reports (here) that the Department of Justice is "reviewing" its statistical analysis, but I cannot imagine that it is much of a review without such basic information as descriptive statistics, a classification results table or a correlation matrix (all standard output in programs that perform logistic regression).

I also asked about the distribution of applications which were merely forwarded by members of Congress v. those which involved substantive support. I was told those data were not "at hand," but I was tersely told that I was welcome to wade through copies of letters on the ProPublica web page and arrive at a determination of the figures (and, I suppose, all decision making rules) myself.

Finally, it was made clear to me that ProPublica, however annoyed with me, was very little concerned about any of the issues that I have raised because, to date, I appear to be the only person who has such concerns. It was furthermore noted that, when the DOJ performed its own recent review of clemency in the Bush administration, I was not consulted. I took all of that to mean exactly what I think it was intended to mean, however awkward its delivery, but had to chuckle as it was ProPublica who called me! and asked me to share information freely! More importantly, and more personally satisfying, I say to readers who know me, and those who are familiar with the history and impact of this blog, one word  ... "respite." :-) Thus, the little blog that can marches on, cheerfully!


Anonymous said...

These are some good criticisms. I share some similar concerns and would like to see some alternative models. Hopefully, they'll be willing to share the data for replication and extensions in the future.

P.S. Ruckman, Jr. said...

Editor: Yes, maybe they will in the future, and maybe there will be no problems associated with it. For now, for whatever reason, they have a pretty sorry attitude about it. Best,
