AI and Algorithms in Criminal Sentencing: VJOLT and VJCL Joint Symposium


Michael Schmid ‘21
Ousted Managing Editor

Rachel Martin ‘23
Columns Editor

On April 2, 2021, the Virginia Journal of Law and Technology (VJOLT) and Virginia Journal of Criminal Law (VJCL) hosted a joint symposium on the use of AI and algorithms in criminal sentencing. The discussion was moderated by the Honorable Jed S. Rakoff, Senior District Judge for the Southern District of New York.[1] The panelists were Professor Deborah Hellman of the Law School; Professor Jessica Eaglin of Indiana University Maurer Law School; Julia Dressel, software engineer at Recidiviz; and Alex Chohlas-Wood, executive director of the Stanford Computational Policy Lab.

Technology has revolutionized many fields, and some say it can also revolutionize our criminal justice system. Arguably, it already has: many jurisdictions have used algorithm-based risk assessment tools for years to decide who gets out on bail and how long people are sentenced to jail. The basic idea behind these risk assessment tools is to use data about a defendant to estimate the likelihood that they will recidivate. Factors like past criminal convictions, employment history, and gender are given different weights and plugged into a mathematical formula. The result is an estimate of how likely a defendant is to skip a court date, be rearrested for any crime, or be rearrested for a violent crime specifically, depending on the formula used.
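To make the mechanics concrete, here is a minimal, hypothetical sketch of the kind of weighted formula described above. The factor names, weights, and logistic form are invented for illustration only; the formulas in commercial tools are proprietary and vary.

```python
import math

# Hypothetical weights for a handful of factors (invented values, not taken
# from any real tool; actual tools use different factors and keep their
# weights secret).
WEIGHTS = {
    "prior_convictions": 0.45,
    "age_at_first_arrest": -0.03,   # a younger first arrest raises the score
    "months_unemployed": 0.10,
    "prior_failures_to_appear": 0.60,
}
INTERCEPT = -2.0


def recidivism_risk(defendant: dict) -> float:
    """Combine weighted factors into a 0-to-1 estimate via a logistic curve."""
    score = INTERCEPT + sum(WEIGHTS[k] * defendant.get(k, 0) for k in WEIGHTS)
    return 1 / (1 + math.exp(-score))


# Two defendants who differ only in employment history receive different
# scores, even though neither number says what either person will actually do.
print(recidivism_risk({"prior_convictions": 2, "age_at_first_arrest": 19,
                       "months_unemployed": 12, "prior_failures_to_appear": 0}))
print(recidivism_risk({"prior_convictions": 2, "age_at_first_arrest": 19,
                       "months_unemployed": 0, "prior_failures_to_appear": 0}))
```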

Proponents of risk assessment technology say that these tools will bring much needed objectivity. In theory, algorithmic tools should treat like individuals alike and minimize the risk of judicial bias in the criminal process. They also promise to provide an alternative to regressive practices like cash bail and to reduce mass incarceration by focusing efforts on those most likely to reoffend. However, their use is highly controversial for a number of reasons.

One of the biggest concerns is that the biases and inequalities that have pervaded the criminal justice system are baked into the algorithms. “Any sort of machine learning or statistical model that is making predictions is necessarily going to be built on historical data of what has happened in that system,” Dressel explained. And that historical data reflects decades of criminalization of Blackness and poverty. For example, police have historically been more likely to stop, search, and arrest Black persons than white persons for low-level offenses or no offense at all. If the algorithm identifies “age of first arrest” or proxies for race like zip codes as factors that predict recidivism, it risks carrying the harms of those policing practices into the future.
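A toy illustration of this dynamic, built on entirely synthetic data and invented rates, shows how a neighborhood-level feature like zip code can end up “predicting” rearrest simply because policing is concentrated there, even when underlying behavior is identical across neighborhoods.

```python
import random

random.seed(0)

# Synthetic world (all numbers invented for illustration): residents of
# neighborhood A reoffend at the same true rate as everyone else, but
# neighborhood A is policed more heavily, so reoffending there is far more
# likely to become a recorded rearrest.
people = []
for _ in range(10_000):
    lives_in_a = random.random() < 0.5
    reoffends = random.random() < 0.30              # same true rate everywhere
    rearrested = reoffends and random.random() < (0.9 if lives_in_a else 0.4)
    people.append((lives_in_a, rearrested))


def rearrest_rate(in_a: bool) -> float:
    group = [rearrested for lives, rearrested in people if lives == in_a]
    return sum(group) / len(group)


# A model trained on rearrest records, rather than on true reoffending, would
# learn that living in neighborhood A "predicts" recidivism.
print(f"observed rearrest rate, neighborhood A: {rearrest_rate(True):.2f}")   # ~0.27
print(f"observed rearrest rate, elsewhere:      {rearrest_rate(False):.2f}")  # ~0.12
```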

Another overriding theme was the concern that these risk assessment tools will be overvalued[2] because they are “scientific.” Human decision-making elicits more skepticism; everybody knows that people can be prone to biases and errors. In contrast, when an AI risk assessment tool reaches a conclusion about someone's recidivism risk based on purportedly objective, scientific criteria, that conclusion can be seen as more accurate, even if it really is not.[3] Often we do not even know how accurate these tools are, because there are no regulations or standards for verifying their accuracy. What is more, independent researchers cannot do this validation, because the algorithms are kept hidden as trade secrets. They are “black boxes” in an area where mistakes have drastic consequences for real people’s lives.

Additionally, there is a fear that these tools mask the subjective judgments on which they are built, adding another layer to the “black box” problem. On the front end, the developer of the algorithm must choose which factors are most pertinent to whether a given individual will recidivate. For example, common misdemeanors like petty theft may be counted while white-collar crimes like embezzlement are not. These choices and the resulting discrepancies get solidified as “objective” truth when judges rely on these tools, thereby reinforcing the criminalization of poor, Black, and other marginalized communities.

Professor Hellman, though, said it was important not to forget that judicial decision-making is similarly a “black box.” While she echoed the concern that algorithms tend to be overvalued because of their “scientific” character, she noted that it is difficult, if not impossible, to truly know how much weight judges give to different factors in making decisions about bond, sentencing, and the like. Judges may be influenced by factors that are just as questionable and subjective. She suggested it is ultimately a comparative question: “are we making things worse or making things better [with risk assessment tools], because the alternative isn’t a system that is free of those [same] problems.”

There was broad agreement, however, that we need more transparency. Chohlas-Wood, for example, stated that having detailed, accessible information on exactly what goes into these tools is vital. He pointed to a Wisconsin case[4] where the defendant challenged the use of gender as a factor in estimating likelihood of recidivism, noting that he would not have been able to challenge this potentially problematic category if he did not know it was being used. Similarly, Professor Eaglin argued that we need to know not just what goes into these tools, but where the data comes from, who picked it, and why.

Moreover, perhaps reform and transparency start in a more fundamental place: what questions are we asking about sentencing? Judge Rakoff noted that the use of these tools is predicated on the idea that we should punish people based not on what they have done, but on what we think they might do in the future. While many have celebrated the shift in focus from retribution to outcome-based theories of punishment, that shift is not always fair or positive. Chohlas-Wood elaborated that the question of risk assessment tools in criminal sentencing comes down to policy judgments about the function of sentencing and incarceration. If the goal is to prevent recidivism, “I think there is a lot of promise,” he stated. If, however, the goal is to rehabilitate, then these tools are likely not helpful.

Algorithmic risk assessment tools also risk dehumanizing people and limiting judges’ ability to adapt outcomes to individual circumstances. Judge Rakoff and Professor Eaglin explained that the rise of AI in judicial decision-making in the criminal process reflects a broader and somewhat concerning trend of recent decades: replacing judicial discretion with a framework of rules that cabin or entirely eliminate that discretion, such as the federal Sentencing Guidelines, mandatory minimum sentences, and career offender statutes. Although the Guidelines are now advisory rather than mandatory, many judges still lean heavily on them, and quite a few judges today have never known anything else, as Judge Rakoff pointed out. Judges were “rightfully angry” when the Guidelines first came out, Professor Eaglin said, because sentencing is supposed to take into account how an individual got where they are and what will best help that individual and society moving forward. The current focus on “things we can measure” thus makes judges’ jobs harder in some ways.

The role of AI and algorithms in the judicial process is still evolving, and it is likely to be a subject of debate and innovation for some time to come. Chohlas-Wood highlighted that, beyond risk assessment tools, there are other, less controversial applications of these technologies. For example, he noted the great success of recent programs that increase court appearances by sending automated, personalized text reminders. Dressel said that her organization is working on technology that can model the system-level impact of policies designed to reduce racial and other disparities. She suggested that AI is better suited to this sort of system-level policy research than to the individual determinations that risk assessments are currently used for. Professor Eaglin concluded that algorithmic risk assessments are just one of many possible ways to reduce incarceration, and they may not be the way we normatively want to approach the problem. We choose to use these tools; there are other choices.

---

ms3ru@virginia.edu
rdm9yn@virginia.edu


[1] Judge Rakoff also teaches the J-Term course “Science and the Courts,” which I highly recommend.—Rachel Martin

[2] I would like to thank Professor Schauer’s Evidence class for arming me with this knowledge.—Michael Schmid

[3] Dressel noted that one popular tool, COMPAS, likely had an accuracy rate of somewhere around 65%, not much better than a coin flip.

[4] State v. Loomis, 881 N.W.2d 749 (Wis. 2016).