Open up the photo app on your phone and search
“dog,” and all the pictures you have of dogs will come
up. This was no easy feat. Your phone knows what a
dog “looks” like.
This and other modern-day marvels are the result of
machine learning. These are programs that comb
through millions of pieces of data and start making
correlations and predictions about the world. The
appeal of these programs is immense. These machines
can use cold, hard data to make decisions that are
sometimes more accurate than a human’s.
But machine learning has a dark side. “Many
people think machines are not biased,” Princeton
computer scientist Aylin Caliskan says. “But machines
are trained on human data. And humans are biased.”
Computers learn how to be racist, sexist, and preju-
diced in a similar way that a child does, Caliskan
explains: from their creators.
We Think Artificial Intelligence Is Impartial. Often, It’s Not.
Nearly all new consumer technologies use machine
learning in some way. Like Google Translate: No per-
son instructed the software to learn how to translate
Greek to French and then to English. It combed
through countless reams of text and learned on its
own. In other cases, machine learning programs
make predictions about which résumés are likely to
yield successful job candidates, or how a patient will
respond to a particular drug.
A machine learning program sifts through billions of data points to solve problems (such as “can you identify the animal in the photo?”), but it doesn’t
always make clear how it has solved the problem. And
it’s increasingly clear these programs can develop
biases and stereotypes without us noticing.
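To make that concrete, here is a minimal sketch in Python (using scikit-learn, which the article does not name) of what “learning from data” looks like. The features and labels are invented for illustration:

```python
# Toy "cat or dog?" classifier. Real systems learn from millions of
# photos; this one learns from four made-up rows of numbers.
from sklearn.tree import DecisionTreeClassifier

# Each row: [weight_kg, ear_length_cm] -- hypothetical features.
X = [[4.0, 7.0], [5.0, 8.0], [20.0, 10.0], [30.0, 12.0]]
y = ["cat", "cat", "dog", "dog"]  # labels supplied by humans

model = DecisionTreeClassifier().fit(X, y)

# No one told the program a rule like "dogs are heavier" -- it
# inferred one from the examples, and in larger models that
# inferred rule can be hard to inspect.
print(model.predict([[25.0, 11.0]]))  # -> ['dog']
```

The point is that the rule lives in the data, not in the code: whatever patterns the examples contain, including biased ones, are what the program learns.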
Last May, ProPublica published an investigation on a
machine learning program that courts use to predict
who is likely to commit another crime after being
booked. The reporters found that the software systematically rated black people at a higher risk than whites.
“Scores like this — known as risk assessments — are
increasingly common in courtrooms across the
nation,” ProPublica explained. “They are used to
inform decisions about who can be set free at every
stage of the criminal justice system, from assigning
bond amounts … to even more fundamental decisions
about defendants’ freedom.”
The program learned about who is most likely to end
up in jail from real-world incarceration data. And
historically, the real-world criminal justice system has
been unfair to black Americans.
This story reveals a deep irony about machine learn-
ing. The appeal of these systems is they can make
impartial decisions, free of human bias. “If computers
could accurately predict which defendants were
likely to commit new crimes, the criminal justice sys-
tem could be fairer and more selective about who is
incarcerated and for how long,” ProPublica wrote.
But in practice, machine learning programs have perpetuated our biases on a large scale. So
instead of a judge being prejudiced against African
Americans, it was a robot.
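The mechanism behind that irony can be shown in a few lines. The following is a deliberately crude sketch, with synthetic data, of how a model trained on biased historical outcomes reproduces the bias; no real risk-assessment tool is this simple, but the dynamic is the same:

```python
# Synthetic illustration: the only feature is group membership,
# and the "historical" labels record that group 1 was jailed more
# often -- for whatever reason, fair or not.
from sklearn.linear_model import LogisticRegression

X = [[0]] * 50 + [[1]] * 50                    # 0 / 1 = group membership
y = [0] * 45 + [1] * 5 + [0] * 25 + [1] * 25   # biased past outcomes

model = LogisticRegression().fit(X, y)

# The model now rates group 1 as "higher risk." It has learned the
# bias in the record, not anything about individual behavior.
print(model.predict_proba([[0], [1]])[:, 1])   # roughly [0.1, 0.5]
```

Nothing in the code is prejudiced; the prejudice arrives with the training labels.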
It’s stories like the ProPublica investigation that led Cal-
iskan to research this problem. As a female computer
scientist who was routinely the only woman in her grad-
uate school classes, she’s sensitive to this subject.
Caliskan has seen bias creep into machine learning in
often subtle ways — for instance, in Google Translate.
Turkish, one of her native languages, has no gender
pronouns. But when she uses Google Translate on
Turkish phrases, it “always ends up as ‘he’s a doctor’ in
a gendered language.” The Turkish sentence didn’t say
whether the doctor was male or female. The computer
just assumed if you’re talking about a doctor, it’s a man.
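This is not how Google Translate is actually built internally, but the statistical choice it faces can be sketched in a toy form. Here a made-up co-occurrence table stands in for the training text, and the translator simply picks whichever pronoun appeared with the noun more often:

```python
# Invented co-occurrence counts standing in for billions of words
# of training text. Turkish "o" is gender-neutral, so translating
# "o bir doktor" ("he/she is a doctor") forces a guess at a pronoun.
corpus_counts = {
    ("he", "doctor"): 9000, ("she", "doctor"): 2000,
    ("he", "nurse"): 1500,  ("she", "nurse"): 8500,
}

def pick_pronoun(noun):
    # Pick the pronoun seen most often next to this noun.
    return max(["he", "she"], key=lambda p: corpus_counts[(p, noun)])

print(pick_pronoun("doctor"))  # -> 'he'  (gender supplied by statistics)
print(pick_pronoun("nurse"))   # -> 'she'
```

The gendered output is not a rule anyone wrote down; it is the majority vote of the text the system was trained on.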
How Robots Learn Implicit Bias
Recently, Caliskan and colleagues published a paper in Science finding that as a computer teaches itself English, it becomes prejudiced against black Americans and women.
Basically, they used a common machine learning pro-
gram to crawl through the internet, look at 840 bil-
lion words, and teach itself the definitions of those
words. The program accomplishes this by looking for
how often certain words appear in the same sen-
tence. Take the word “bottle.” The computer begins