Thursday, October 20, 2016

Modeling thorny problems.

Today, I got into a nice twitter discussion with former colleague and NLPer extraordinaire (even when he hates on algorithms) Hal Daumé. Here's what he said:

To which I replied:
It seemed to be worth fleshing this out a little.

In the work we've been doing in algorithmic fairness, one of the trickiest issues has been attempting to mathematically capture the different (and very slippery) notions of fairness, bias, and nondiscrimination that are used in common parlance and in the more technical literature on the subject. As Hal noted in his tweet above, these are very difficult problems for which no one really has the right answer.

How then does one make forward progress? In my view, framing is an essential part of the discussion, and one that we tend not to consider very explicitly in computer science. That is to say, deciding how to ask the question has much more impact on forward progress than the specific nature of the solutions.

This is not news if you look at politics, the law, policy discussions or the social sciences in general. But it matters a great deal as CS starts dealing with these issues because the choices we make in how to pose questions change the kinds of progress we hope to make. (while I realize that "questions matter, not solutions" is one of the oldest tropes in all of research, modeling issues like ethics and fairness requires a whole other dimension of framing -- see this nifty reframing of the editorial role of Facebook for example)

Specifically, I think of these problems as points in a partial order. It's relatively easy to talk about progress along a chain: that's what CS is really good at. But it's important to realize that we're living in a partial order, and that when we argue about different approaches, we might actually be arguing about fundamentally different and incomparable issues.

So then what happen when we reach different maximal elements (or imagine ideal maximal elements)? That's where the arguments center around beliefs, axioms and worldviews (not surprisingly, that's how we're thinking about fairness). But then the arguments are of a different nature altogether, and realizing that is very important. It's no longer an issue of what running time, what approximation ratio, what generalization bound you can get, but rather what aspects you're trying to capture and what you're leaving out.

So while it might seem that the problems we face in the larger realm of algorithms and ethics are irreducibly complex (they are) there's value in understanding and creating a partial order of formulations and making progress along different chains, while also arguing about the maximal elements.


One day, when I have nothing to do (ha!) I might attempt to write down some thoughts on how to model ill-structured problems mathematically. And it's not just about algorithmic fairness -- although that tests my skills royally. Any kind of mathematical modeling requires certain skills that we don't learn when we learn to solve crisply defined problems.

For example, one aphorism I found myself spouting some weeks ago was
Model narrowly; solve broadly. 
This was in the context of a discussion I was having with my undergraduate student about a problem we're trying to formalize. She had a nice model for it, but it was very general, and I worried that it was capturing too much and the essential nature of the problem wasn't coming through. Of course when it comes time to solve a problem though, using general hammers is a good place to start (and sometimes finish!).

Disqus for The Geomblog