Open up the photo app on your phone and search
“dog,” and all the pictures you have of dogs will come
up. This was no easy feat. Your phone knows what a
dog “looks” like.
This and other modern-day marvels are the result of
machine learning. These are programs that comb
through millions of pieces of data and start making
correlations and predictions about the world. The
appeal of these programs is immense. These machines
can use cold, hard data to make decisions that are
sometimes more accurate than a human’s.
But machine learning has a dark side. “Many
people think machines are not biased,” Princeton
computer scientist Aylin Caliskan says. “But machines
are trained on human data. And humans are biased.”
Computers learn how to be racist, sexist, and preju-
diced in a similar way that a child does, Caliskan
explains: from their creators.
We Think Artificial Intelligence Is Impartial. Often, It’s Not.
Nearly all new consumer technologies use machine
learning in some way. Like Google Translate: No per-
son instructed the software to learn how to translate
Greek to French and then to English. It combed
through countless reams of text and learned on its
own. In other cases, machine learning programs
make predictions about which résumés are likely to
yield successful job candidates, or how a patient will
respond to a particular drug.
A machine learning program sifts through billions of data points to solve problems (such as “can you identify the animal in the photo?”), but it doesn’t
always make clear how it has solved the problem. And
it’s increasingly clear these programs can develop
biases and stereotypes without us noticing.
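To make that concrete, here is a minimal sketch in Python (using scikit-learn, which the article does not name) of what “learning from data” looks like. The features and labels are invented for illustration:

```python
# Toy "cat or dog?" classifier. Real systems learn from millions of
# photos; this one learns from four made-up rows of numbers.
from sklearn.tree import DecisionTreeClassifier

# Each row: [weight_kg, ear_length_cm] -- hypothetical features.
X = [[4.0, 7.0], [5.0, 8.0], [20.0, 10.0], [30.0, 12.0]]
y = ["cat", "cat", "dog", "dog"]  # labels supplied by humans

model = DecisionTreeClassifier().fit(X, y)

# No one told the program a rule like "dogs are heavier" -- it
# inferred one from the examples, and in larger models that
# inferred rule can be hard to inspect.
print(model.predict([[25.0, 11.0]]))  # -> ['dog']
```

The point is that the rule lives in the data, not in the code: whatever patterns the examples contain, including biased ones, are what the program learns.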
Last May, ProPublica published an investigation on a
machine learning program that courts use to predict
who is likely to commit another crime after being
booked. The reporters found that the software systematically rated black people at a higher risk than whites.
“Scores like this — known as risk assessments — are
increasingly common in courtrooms across the
nation,” ProPublica explained. “They are used to
inform decisions about who can be set free at every
stage of the criminal justice system, from assigning
bond amounts … to even more fundamental decisions
about defendants’ freedom.”
The program learned about who is most likely to end
up in jail from real-world incarceration data. And
historically, the real-world criminal justice system has
been unfair to black Americans.
This story reveals a deep irony about machine learn-
ing. The appeal of these systems is they can make
impartial decisions, free of human bias. “If computers
could accurately predict which defendants were
likely to commit new crimes, the criminal justice sys-
tem could be fairer and more selective about who is
incarcerated and for how long,” ProPublica wrote.
But in practice, machine learning programs have perpetuated our biases on a large scale. So
instead of a judge being prejudiced against African
Americans, it was a robot.
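The mechanism behind that irony can be shown in a few lines. The following is a deliberately crude sketch, with synthetic data, of how a model trained on biased historical outcomes reproduces the bias; no real risk-assessment tool is this simple, but the dynamic is the same:

```python
# Synthetic illustration: the only feature is group membership,
# and the "historical" labels record that group 1 was jailed more
# often -- for whatever reason, fair or not.
from sklearn.linear_model import LogisticRegression

X = [[0]] * 50 + [[1]] * 50                    # 0 / 1 = group membership
y = [0] * 45 + [1] * 5 + [0] * 25 + [1] * 25   # biased past outcomes

model = LogisticRegression().fit(X, y)

# The model now rates group 1 as "higher risk." It has learned the
# bias in the record, not anything about individual behavior.
print(model.predict_proba([[0], [1]])[:, 1])   # roughly [0.1, 0.5]
```

Nothing in the code is prejudiced; the prejudice arrives with the training labels.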
It’s stories like the ProPublica investigation that led Cal-
iskan to research this problem. As a female computer
scientist who was routinely the only woman in her grad-
uate school classes, she’s sensitive to this subject.
Caliskan has seen bias creep into machine learning in
often subtle ways — for instance, in Google Translate.
Turkish, one of her native languages, has no gender
pronouns. But when she uses Google Translate on
Turkish phrases, it “always ends up as ‘he’s a doctor’ in
a gendered language.” The Turkish sentence didn’t say
whether the doctor was male or female. The computer
just assumed if you’re talking about a doctor, it’s a man.
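This is not how Google Translate is actually built internally, but the statistical choice it faces can be sketched in a toy form. Here a made-up co-occurrence table stands in for the training text, and the translator simply picks whichever pronoun appeared with the noun more often:

```python
# Invented co-occurrence counts standing in for billions of words
# of training text. Turkish "o" is gender-neutral, so translating
# "o bir doktor" ("he/she is a doctor") forces a guess at a pronoun.
corpus_counts = {
    ("he", "doctor"): 9000, ("she", "doctor"): 2000,
    ("he", "nurse"): 1500,  ("she", "nurse"): 8500,
}

def pick_pronoun(noun):
    # Pick the pronoun seen most often next to this noun.
    return max(["he", "she"], key=lambda p: corpus_counts[(p, noun)])

print(pick_pronoun("doctor"))  # -> 'he'  (gender supplied by statistics)
print(pick_pronoun("nurse"))   # -> 'she'
```

The gendered output is not a rule anyone wrote down; it is the majority vote of the text the system was trained on.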
How Robots Learn Implicit Bias
Recently, Caliskan and colleagues published a paper in Science finding that as a computer teaches itself English, it becomes prejudiced against black Americans and women.
Basically, they used a common machine learning pro-
gram to crawl through the internet, look at 840 bil-
lion words, and teach itself the definitions of those
words. The program accomplishes this by looking for
how often certain words appear in the same sen-
tence. Take the word “bottle.” The computer begins