In “Talking to Strangers,” Malcolm Gladwell remarks that humans are consistently and predictably terrible at judging others. To illustrate, Gladwell references an artificial intelligence (AI) bail-decision program that considers certain criteria about the person in question, then returns its assessment of whether that person is likely to commit another crime while out on bail. The findings are striking yet perhaps unsurprising: the algorithm better predicts which arrested persons are likely to re-offend while out on bail, and thus which people should not be granted bail. In short, the computer outperforms the judge.
The judicial process rests on the premise that in-person judgment in the courtroom leads to more accurate decision making. But humans are filled with bias, and what’s more, humans can be very good at deceiving (Gladwell 2019). Does this mean, then, that we should be using AI to decide how people are judged and placed? Or, for that matter, any decisions that may affect humanity?
Unfortunately, there are countless examples in which AI algorithms create inequities. For one, while in graduate school, Joy Buolamwini sought to create a mirror that projects various “heroes” onto her own face. In doing so, she uncovered something alarming: the mirror would not recognize her face unless she placed a white mask over it. She then examined the facial recognition technology at Amazon, Google, Microsoft, and IBM and found that error rates for white men were less than 1%, while for Black and brown women the error rate was a whopping 30%. Buolamwini’s continued work found that this difference in error rates was largely due to the fact that the underlying image datasets consisted mostly of white men (Shacknai 2022). And this is no shock: a quick Google image search for “men” returns largely white men.
“When we think about transitioning into the world of tech, the same things that are being marginalized and ignored by the conversations we have around racial inequality in the U.S.—skin tone and colorism—are also being marginalized and ignored in the tech world” - Harvard University Professor Dr. Monk
Similarly, predictive policing² uses past policing data to predict which neighborhoods to police next. While this methodology may seem “unbiased” because it removes biased people from the decision-making process, the reality is just the opposite: policing data is itself systemically biased and racist. As a result, predictive policing tools can create vicious feedback loops. Neighborhoods that were more heavily policed in the past because of racism and bias are policed even more, which inevitably produces more recorded arrests and infractions, and therefore more data pointing the predictive model back at those very same neighborhoods (Lum and Isaac 2016).
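To make that loop concrete, here is a minimal toy simulation of the dynamic Lum and Isaac describe, under heavy simplifying assumptions: two neighborhoods with identical true crime rates, patrols allocated in proportion to past recorded incidents, and crime only entering the data where patrols are present. All numbers and names are invented for illustration, not drawn from any real police data.

```python
# Toy sketch of the predictive-policing feedback loop described above.
# Everything here is hypothetical; it is not a model of any real police data.
import numpy as np

rng = np.random.default_rng(42)

# Two neighborhoods with IDENTICAL underlying crime rates...
true_crime_rate = np.array([0.10, 0.10])
# ...but neighborhood 0 starts with more *recorded* incidents because it was
# historically policed more heavily (the biased "raw data").
recorded_counts = np.array([30.0, 10.0])
total_patrols = 100

for year in range(10):
    # "Predictive" step: allocate patrols in proportion to past recorded incidents.
    patrol_share = recorded_counts / recorded_counts.sum()
    patrols = total_patrols * patrol_share

    # Crime only enters the data where officers are present to record it,
    # so new records scale with patrol presence, not with true crime alone.
    new_records = rng.poisson(patrols * true_crime_rate)
    recorded_counts += new_records

    print(f"year {year}: patrol share = {np.round(patrol_share, 2)}")
```

Because the model only ever sees what the patrols record, the initial disparity never corrects itself, even though the underlying crime rates in the two neighborhoods are identical.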
This is because “raw data” is not, in fact, raw in the sense of being unbiased; it is grounded in the inequities present at the time it was collected. For that reason, if we are to use AI to help with decision making, it is imperative that we re-contextualize our data and recognize the biases at play, so that our model inputs are no longer biased (Gitelman 2013).
But how can we ensure our raw data is re-contextualized and bias-free? One possible mitigation strategy is to create a datasheet for every dataset used in our models (Gebru et al. 2018). In “Datasheets for Datasets,” Timnit Gebru and coauthors show how documenting a wealth of information about each dataset can help avoid unintended consequences in its use. The paper proposes a long list of probing questions to answer about our datasets, such as:
“For what purpose was the dataset created?”
“Who funded the creation of the dataset?”
“Does the dataset contain all possible instances or is it a sample (not necessarily random) of instances from a larger set?”
“Is any information missing from individual instances?”
“Who was involved in the data collection process (e.g., students, crowdworkers, contractors) and how were they compensated (e.g., how much were crowdworkers paid)?”
When we have datasheets that comprehensively describe our data, we are better able to identify biases before building AI models on top of it.
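One lightweight way to operationalize this is to let the datasheet travel with the data itself in machine-readable form. The sketch below is a hypothetical, pared-down example loosely following the question categories in Gebru et al. (2018); the dataset name, field names, and every answer are invented purely for illustration.

```python
# A hypothetical, machine-readable datasheet sketch, loosely following the
# question categories in Gebru et al. (2018). All names and answers below
# are invented for illustration only.
import json
from dataclasses import dataclass, field, asdict

@dataclass
class Datasheet:
    dataset_name: str
    purpose: str                      # For what purpose was the dataset created?
    funding: str                      # Who funded the creation of the dataset?
    complete_population: bool         # All possible instances, or a (possibly non-random) sample?
    sampling_notes: str
    missing_information: str          # Is any information missing from individual instances?
    collectors_and_compensation: str  # Who collected the data and how were they compensated?
    known_biases: list = field(default_factory=list)

sheet = Datasheet(
    dataset_name="sample_arrest_records.csv",  # hypothetical file
    purpose="Exploratory analysis of bail outcomes",
    funding="Unspecified university research grant",
    complete_population=False,
    sampling_notes="Only incidents that resulted in an arrest; shaped by where patrols were sent",
    missing_information="Demographic fields are blank for some rows",
    collectors_and_compensation="Salaried city records clerks",
    known_biases=["Over-represents neighborhoods that were historically policed more heavily"],
)

# Ship the datasheet alongside the data so every model built on it inherits this context.
with open("sample_arrest_records.datasheet.json", "w") as f:
    json.dump(asdict(sheet), f, indent=2)
```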
And to that point, our AI models themselves should also come with documentation, for increased transparency, better model comprehension, and ultimately, better quality checks. The most transparent models are often referred to as “white box” models: any outside observer can clearly see how the model arrives at its decision, which makes it much easier to tell whether its recommendations are rooted in bias. “Black box” models lack this transparency, so it is nearly impossible to see whether the model is relying on incorrect or unjust assumptions as part of its methodology (McNally, n.d.). And because AI models can morph as new training data continuously arrives, white-box transparency carries even more weight: the method a model once used, and that was once understood, may be drastically different from the method it uses today. For this reason, it is greatly important to perform consistent checks on model assumptions.
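To illustrate the contrast, here is a small hypothetical sketch on synthetic data. A shallow decision tree is a classic white-box model whose learned rules can be printed and audited directly; the feature names and labels are made up for the example.

```python
# A hypothetical white-box example on synthetic data: a shallow decision tree
# whose learned rules can be printed and audited line by line. Feature names
# and data are invented; the point is only the inspectability of the model.
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(0)
feature_names = ["prior_arrests", "age", "days_since_last_offense"]  # hypothetical features
X = rng.normal(size=(500, 3))
y = (X[:, 0] > 0.5).astype(int)  # synthetic labels driven by the first feature only

# White box: every split in the fitted tree is human-readable.
white_box = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)
print(export_text(white_box, feature_names=feature_names))

# An auditor can read the printed rules and ask whether each split is justified.
# A black-box counterpart (a large neural network, say) would return predictions
# with no comparably direct view of how it reached them, making questions like
# "is the model leaning on a proxy for race or neighborhood?" much harder to answer.
```

In practice, this kind of inspection would be repeated every time the model is retrained, precisely because the learned rules can drift as new data arrives.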
So - back to the original question: should AI be used for decision making?
All in all, AI will be only as just as the data and assumptions that go into it. But with careful input data and model checks and balances, it is certainly possible to create AI algorithms that are at least less biased than the average human. Algorithms may never be perfect, but that is exactly why it is so important for our models and input data to be transparent and open for diverse sets of stakeholders to scrutinize.
References
Footnotes
Predictive policing is the use of computer algorithms to predict which neighborhoods to police.
Citation
@online{cutler2022,
author = {Victoria Cutler},
editor = {},
title = {Ethical {AI} for {Just} {Decision} {Making}},
date = {2022-12-07},
url = {https://victoriacutler.github.io/posts/2022-10-24-url-title/},
langid = {en}
}