Tuesday, December 01, 2015

Fairness and The Good Wife

The Good Wife is a long-running legal TV show with a Chicago politics backdrop. It's one of the most popular shows on network TV, and so it was particularly fascinating to see an episode devoted in part to issues of algorithmic fairness.

It's helpful to know that the show frequently features a Google-like company called Chumhum. In this episode, they're being sued because of a feature in their maps application called "Safe Filter" that marks regions of a city (Chicago) as safe or not safe. A restaurant owner claims that she went out of business because Chummy Maps (their horribly named maps application) marked the region around her restaurant as unsafe.

The writers of this episode must have been following some of the news coverage of fairness in algorithms over the past year: a number of news items were referenced in passing. What follows is an annotated list of references.

  • "Math is not racist". This is what the COO of Chumhum says when he's first deposed, arguing that the Safe Filter uses only algorithms to determine which regions are safe. This is reminiscent of the overly uncritical article that NPR did about the use of machine learning in hiring: the CEO of Jobaline (one of the companies doing this) happily proclaimed that "math is blind".
  • "Tortious intent": Disparate impact is one vehicle that might be used to determine bias-by-algorithm. But in the episode, the lawyers argue a stronger claim, that of tortious intent, which is interesting because they then have to show deliberate racial bias. 
  • "Objective third party statistics and user generated content": The initial line of defense from Chumhum is that they use third party statistics like crime rates. The lawyers immediately point out that this could itself introduce bias. Chumhum also offers its reliance on user-generated content as a defense ("We're not racist: our users are"). The lawyers rebut this by pointing out that the users of the maps app skew heavily Caucasian (bringing up another good point about how bias in training data can leach into the results).
  • "Full discovery": Chumhum wanted to hide behind its algorithm: the opposing lawyers made a successful argument for discovery of the algorithm. I doubt this could ever happen in real life, what with trade secrets and all. More on this later. 
  • "Home ownership rates as proxy for race": One subplot involved determining whether home-ownership rates were being used in the Safe Filter. The characters immediately realized that this could be a proxy for race and could indicate bias.
  • "The animal incident": This was a direct reference to the image-tagging fiasco of a few months ago when Google's photo app started labelling pictures of African-Americans as 'gorillas'. While at first this is a throw-away incident (including a line "Even Google did it!"), it comes back later to haunt the company when a lawyer looks at the code (ha!) and discovers a patch that merely removes the 'animal' tag (instead of fixing the underlying problem). This also appears to be what Google did to "solve" its problem.
  • "Differential ad placement": In a hat tip to the work by Latanya Sweeney and the CMU team, another plot point turned on the discovery that ads in the maps application were targeting the white lawyer with ads for skiing and the black lawyer with ads for soul food. This in and of itself was not a problem for the case, but it led to a much more fascinating argument: that Chumhum was drawing on user profile data from all its properties (search, email, etc.) to target ads, and so discovery could not be limited solely to maps-related data and code. This is in general the problem with asking for code to do an audit: if you don't know where the training data is coming from, the code is basically useless. Remember, an algorithm isn't always just a piece of code :)
  • "Bad training data/non-diverse workforce": One of the employee characters made the argument that the bad image tagging results were the result of "bad training data", which is an accurate statement and is part of the fairness concerns with algorithms. The lawyer also made the point that a less homogeneous workplace might have helped as well (which brings to mind the Al Jazeera panel I participated in a few months ago).
  • "Implicit bias": I was happy when this phrase was used correctly to argue for how even "non-racist" people can help perpetuate a racist system. I would have been happier if someone had said "disparate impact" though :).
If you're wondering, the final resolution of the case did NOT turn on a determination of whether there was bias. It turned out that the restaurant had been losing money before the filter was even put into place. But it was interesting to see an example (albeit on TV) of how a court case on this might pan out. A lot of the sideshow involved trying to claim that programmers on the Maps app were racist (or had racist inclinations) to argue for why the code might be biased as well.
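
Since nobody on the show actually said "disparate impact", here is a rough sketch of what a check for it might look like. It computes the ratio between the rates at which two groups receive a favorable outcome and compares it to the 0.8 threshold of the "four-fifths rule" from US employment guidelines. Everything specific here is an assumption made for illustration: the group labels, the numbers (which are invented), and the idea of applying an employment rule of thumb to neighborhoods marked "safe" by a maps app. Nothing in the episode describes how the Safe Filter works.

```python
from collections import defaultdict


def favorable_rates(records):
    """Return, per group, the fraction of records with a favorable outcome.

    `records` is an iterable of (group, outcome) pairs, where `outcome`
    is True when the outcome is favorable (here: marked "safe").
    """
    totals = defaultdict(int)
    favorable = defaultdict(int)
    for group, outcome in records:
        totals[group] += 1
        if outcome:
            favorable[group] += 1
    return {group: favorable[group] / totals[group] for group in totals}


def disparate_impact_ratio(records, protected, reference):
    """Ratio of the protected group's favorable rate to the reference group's."""
    rates = favorable_rates(records)
    if rates[reference] == 0:
        raise ValueError("reference group has no favorable outcomes")
    return rates[protected] / rates[reference]


# Invented data: 10 neighborhoods per (hypothetical) group,
# each paired with whether the filter marked it "safe".
records = (
    [("majority_black", True)] * 3
    + [("majority_black", False)] * 7
    + [("majority_white", True)] * 8
    + [("majority_white", False)] * 2
)

ratio = disparate_impact_ratio(records, "majority_black", "majority_white")

# Four-fifths rule of thumb: a ratio below 0.8 is a signal worth investigating.
flagged = ratio < 0.8
print(f"disparate impact ratio: {ratio:.3f}, flagged: {flagged}")
```

Note that a check like this looks only at outcomes, never at intent or at the code, which is exactly why it is a different kind of claim from the tortious intent argued in the episode.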
