Saturday, February 06, 2016

ITA FTW: Bayesian surprise and eigenvectors in your meal.

I've been lounging in San Diego at my favorite conference, the Information Theory and Applications workshop. It's so friendly that even the sea lions are invited (the Marine Room is where we had the conference banquet).

Sadly, this year I was mired in deadlines and couldn't take full advantage of the wonderful talks on tap or the more than 720 people who attended. Pro tip, ITA: could you try to avoid the ICML/KDD/COLT deadlines next time? :)

ITA always has fun events that our more "serious" conferences should learn from. This time, the one event I attended was a Man vs. Machine cookoff, which I thought was particularly apropos since I had just written a well-received article using a cooking metaphor to think about algorithms and machine learning.

The premise: Chef Watson (IBM's Watson, acting as a chef) designs recipes for dinner (appetizer/entree/dessert) with an assist from a human chef. Basically, the chef puts in some ingredients and Watson suggests a recipe (not from a list of recipes, but from its database of knowledge about chemicals, which tastes 'go well together', and so on). The event was facilitated by Kush Varshney from IBM, who works on this project.

Each course was presented as a blind pairing of the Watson and human recipes, and it was our job to vote for which one we thought was which.

It was great fun. We had four special judges, and each of us had a placard with red and blue sides to cast our votes. After each course, Kush gave us the answer.

The final score: 3-0. The humans guessed correctly for each course. The theme was "unusualness": the machine-generated recipes had somewhat stranger combinations, and because Watson doesn't (yet) know about texture, the machine recipes had a different mouthfeel.

This was probably the only time I've heard the words 'Bayesian surprise' and 'eigenvector' used in the context of food reviews.

Thursday, February 04, 2016

Making all my reviews public (and annotated): A question.

I was reading a post on G+ about a musician who keeps all her performance reviews on her website and annotates them with a response: not to "fight back", but to add to the reviews (which are occasionally negative).

I'm very tempted to do the same thing myself with my submissions. I think this will provide more clarity about the nature of the review process, about the often honest and revealing discussions that take place behind the peer-review walls, and about how subtleties in the writing can change the perception of a work. I suspect that as a consequence I'll be more circumspect about submitting something half-baked (which might be a good thing). I'll have to be careful not to get defensive in my responses to the reviews (which is always hard). And I may not be able to get away as easily with "changing the introduction" to get a paper in (which happens shockingly often).

Of course the biggest problem will be getting my co-authors (who are often my students) to agree beforehand. So here's my question:
Would you work with me if you knew I was planning to make all my reviews public? 

Monday, February 01, 2016

On "the moral hazard of complexity-theoretic assumptions"

(ed: In a recent CACM editorial, CACM editor-in-chief +Moshe Vardi discussed Babai's result on graph isomorphism and the recent hardness result for edit distance in the context of a larger discussion on the significance of complexity-theoretic assumptions. +Piotr Indyk (one of the authors of the edit distance result and an occasional guest blogger) posted the following as a response. This response has also been posted as a comment on Moshe's post.)

(ed: Update: After Piotr posted the comment, Moshe responded, and then Piotr responded again. Please visit the article page to read the exchange between the two.)

In a recent CACM editorial, Dr. Vardi addresses what he calls "a moral hazard of complexity-theoretic assumptions" and "a growing gap between current theory and practice of complexity and algorithms". Given that the article mentions a paper that I co-authored ["Edit Distance Cannot Be Computed in Strongly Subquadratic Time (unless SETH is false)", STOC'15], I believe it is appropriate for me to respond. In short, I believe that much of the analysis in the article stems from a confusion between press coverage and the actual scientific inquiry. Indeed, it is unclear whether Dr. Vardi is addressing what he believes to be a "media" phenomenon (how journalists describe scientific developments to a broad public) or a "scientific" phenomenon (how and why scientists make and use assumptions in their research and describe them in their papers). To avoid conflating these two issues, I will address them one by one, focusing on our paper as an example.

  1. Media aspects: The bulk of the editorial is focused on the press coverage of recent scientific developments in algorithms and complexity. In particular, Dr. Vardi mentions the title of a Boston Globe article covering our work ("For 40 Years, Computer Scientists Looked for a Solution that Doesn't Exist."). As I have already discussed elsewhere ( ), I completely agree that the title and some other parts of the article leave a lot to be desired. Among other things, the conditional aspects of the result are discussed only at the end of the article, and are therefore easy to miss. At the same time, I believe some perspective is needed. Inaccuracy or confusion in popular reporting of scientific results is an unfortunate but common and longstanding phenomenon (see, e.g., this account of the press coverage of Khachiyan's famous linear programming algorithm in the 1970s). There are many reasons for this. Perhaps the chief one is the cultural gap between the press and the scientists: journalists emphasize accessibility and newsworthiness, while scientists emphasize precision. As a result, simplification in scientific reporting is a necessity, and the risk of oversimplification, inaccuracy or incorrectness is high. Fortunately, more time and effort spent on both sides can lead to more thorough and nuanced articles (e.g., see ). Given that the coverage of algorithms and complexity results in the popular press is growing, I believe that, in time, both scientists and journalists will gain valuable experience in this process.
  2. Scientific aspects: Dr. Vardi also raises some scientific points. In particular:
    • Dr. Vardi is critical of the title of our paper: "Edit Distance Cannot Be Computed in Strongly Subquadratic Time (unless SETH is false)". I can only say that, given that we state the assumption explicitly in the title, the abstract, the introduction, and the main body of the paper, I believe the title accurately represents the paper's contribution.
    • Dr. Vardi is critical of the validity of SETH as a hardness assumption: this question is indeed the subject of robust discussion and investigation (see, e.g., the aforementioned Quanta article; the standard form of the assumption is also stated below for reference). My best guess, and that of most of the people I have discussed this with, is that the assumption is true; however, this is far from a universal opinion. Quantifying the level of support for this conjecture would be an interesting project, perhaps along the lines of similar efforts concerning the P vs. NP conjecture ( ). In any case, it is crucial to strengthen the framework by relying on weaker assumptions, or by replacing one-way reductions with equivalences; both are subjects of ongoing research. However, even the existing developments have already led to concrete benefits. For example, failed attempts to prove conditional hardness of certain problems have led to better algorithms for those tasks.
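(ed: for reference, here is the standard statement of SETH and the form of the paper's conditional bound, both paraphrased:)

```latex
% SETH (the Strong Exponential Time Hypothesis), as usually stated, and the
% conditional lower bound from the STOC'15 paper, both paraphrased.
\textbf{SETH:} for every $\epsilon > 0$ there is a $k$ such that $k$-SAT on $n$
variables cannot be solved in $O(2^{(1-\epsilon)n})$ time.

\textbf{Conditional bound:} if the edit distance of two strings of length $n$ can
be computed in $O(n^{2-\delta})$ time for some constant $\delta > 0$, then SETH
is false.
```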

Finally, let me point out that one of the key motivations for this line of research is precisely the strengthening of the relationship between theory and practice in complexity and algorithms, a goal that Dr. Vardi refers to as an important challenge. Specifically, this research provides a framework for establishing evidence that certain computational questions cannot be solved within concrete (e.g., sub-quadratic) polynomial time bounds. In general, I believe that a careful examination of the developments in algorithms and complexity over the last decade would show that the gap between theory and practice is shrinking, not growing. But that is a topic for another discussion.

Wednesday, January 20, 2016

On math in job talks

It's job talk season around the country, and this note from +Lance Fortnow popped up on my twitter feed:
I was a little puzzled by this claim, and soon the following response from +Ryan Williams appeared:
At which point +Lev Reyzin chimed in with:
My response to all of this was:
Above all, a job talk should be comprehensible to a general CS audience, for many of whom the last algorithms class they took was 25 years ago, and for whom IP, PCPs, or even approximation algorithms are strange unknown beasts. And so I understand the sentiment expressed in Lance's post ("don't get too technical").

But avoiding formulas seems excessive. If you're expressing mathematical concepts, sometimes you need to be precise. Especially if your work hinges on being careful about these concepts. Thankfully, a good result has a nice intuitive explanation, and it's usually possible to do what Ryan suggested. For example, my favorite pitch for IP, PSPACE and the like is "A conversation is likely exponentially better than a monologue". (For why I'd be talking about IPs and PSPACE, you'll have to wait for my next post or two). 

But couple that with the formal expression! Otherwise the more intuitive explanation seems like a giant hand wave that doesn't make any sense.
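Here's what I mean, using my own example. The slogan compresses a precise statement (Shamir's theorem), and in a talk I'd put the two side by side:

```latex
% The pitch: "a conversation is likely exponentially better than a monologue."
% The formalism it compresses (Shamir's theorem):
\mathsf{IP} = \mathsf{PSPACE}
% A polynomial-round conversation with an all-powerful prover can verify
% everything in PSPACE, whereas a monologue (a single certificate checked in
% polynomial time) captures only NP, believed to be a far smaller class.
```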

One of my standard rants when I'm explaining to students how to read theory papers is that they have to be able to go fluidly between notation and intuition, and that they have to mentally translate the formalisms in a paper into intuition in order to really get what the paper is saying. But if we can't present both forms in a single setting, how is any non-expert to realize that the two are related?

In short, if you're doing a job talk where I'm in attendance, don't be shy to put in formulas, but make sure to give some intuition for what the formalism is saying.

Friday, January 15, 2016

Reading With Purpose: A grand experiment

This is the preface to a set of notes I'm writing for a seminar this semester. It will be a fun and bumpy ride!
Reading research papers in mathematical disciplines takes training. A student’s natural tendency is to read a paper from beginning to end in a depth-first manner. Every time they get stuck at a concept, they branch down another path, and then another, to the point where even getting through the abstract of a paper can take days and is exhausting.
Needless to say, this is not the best way to read technical material of any kind, mathematical or otherwise. 
I have spent many hours explaining to my students how they should be reading papers, and if you’re reading this and you have students of your own, you’ve probably done the same. While I don’t claim to have the perfect method for reading any given paper, I’ve learnt over the years how to quickly skim a paper, flying over it and scanning the terrain as it were. I’ve learnt when to make a deep dive all the way to ground level, and when to resurface. And above all, I’ve learnt to distinguish the different reasons for reading technical material in the first place, and that each mode requires a different treatment. 
I’ve looked all over the web for good material to give to students to learn how to read papers in (theoretical) computer science. I’ve found a few short articles here and there: S. Keshav’s note on reading systems papers, Michael Mitzenmacher’s helpful document, a page by Jason Eisner, and the many blog posts by Terry Tao, to name a few. There are also a number of good short articles on reading mathematics that I’ve drawn some inspiration from. 
But none of these capture the essence of what I try to achieve in my reading and what I want my students to be able to do. It’s a kind of Thurstonian ideal of understanding: where the proof falls away and what you’re left with is a deep conceptual appreciation of how the parts of a proof (or a theorem, or a set of definitions) fit together, and what they really mean. 
Moreover, if one is to teach the art of reading, it is important not just to describe the goal, but to find a way to get there: with exercises, little practices, and muscle-strengthening activities. This is after all the difference between teaching and showing.

That is my motivation for this collection of notes. It is my attempt to demystify and systematize a set of tools for reading papers. It will not be complete, and it will likely not resemble how you read papers. But hopefully it provides one way of getting to the promised land.

Thursday, January 07, 2016

SoCG-STOC Joint Workshops Call

As you all might know, in an event sure to get someone a Nobel Peace Prize, the Symposium on Computational Geometry (SoCG) and the Symposium on the Theory of Computing (STOC) will "sympose" together in Boston this year.

As part of this peace offensive designed to unite points and graphs, the combinatorial and the geometric, the Laplacian and the Hilbertian, (someone please stop me now...), there will be a day of overlap between the two conferences. The intention is to have a day of workshops on topics of interest to both communities, as well as two invited talks (by Timothy Chan and Santosh Vempala).

I'm one of the co-organizers of this joint day of workshops (together with Yusu Wang, Chandra Chekuri and Alex Andoni). And I strongly encourage you to consider putting together a proposal for a workshop/tutorial/other event that brings together folks from these communities.

For more details, please visit the workshops page. The deadline for submitting proposals is Feb 22, 2016. Proposals do not need to be very long - they merely need to answer specific questions as detailed in the CFP.

This is a great opportunity to expose the communities to topics that they might not ordinarily bump into, but could enjoy and benefit from. I'm very excited to see what ideas get tossed around, and am looking forward to the event. Kudos to the STOC and SoCG steering committees for making this happen.

Saturday, December 12, 2015

White Elephant parties and fair division

It's holiday party season, which means that it's time for white elephant parties. For those who don't know what a white elephant party is, here's a description from Wikipedia:
Each participant supplies one wrapped gift. The gifts are placed in a central location, and participants determine in which order they will take turns selecting them. The first person opens a wrapped gift, and the turn ends. On subsequent turns, each person can open a new present or gets the choice to "steal" another person's gift. The gift cannot be stolen once the third participant touches the gift (i.e., it has been stolen for the second time). When a person's gift is stolen, that person can either choose another wrapped gift to open or can steal from another player. The game is over when the last person goes and the first person goes again.
I just finished my algorithms class with the "traditional" (n=2) discussion of fair division algorithms (aka cake cutting, but with actual cake). White elephant parties are also a form of fair division with indivisible goods.
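For readers who haven't seen it, the n=2 protocol is the classic cut-and-choose: one player cuts the cake into two pieces she values equally, and the other picks the piece he prefers. Here's a minimal sketch in Python; the valuation densities and function names are mine, purely for illustration:

```python
# Cut-and-choose for n = 2: a minimal sketch, not a definitive implementation.
# Each player's valuation of the cake [0, 1] is modeled (hypothetically) as a
# density function; the value of an interval is its integral under that density.

def interval_value(density, a, b, steps=5000):
    """Midpoint-rule value of the interval [a, b] under the given density."""
    if b <= a:
        return 0.0
    h = (b - a) / steps
    return sum(density(a + (i + 0.5) * h) for i in range(steps)) * h

def cut_and_choose(density1, density2):
    """Player 1 cuts at her halfway point; player 2 takes the piece he prefers."""
    half = interval_value(density1, 0.0, 1.0) / 2
    # Binary search for the cut where player 1 values [0, cut] at exactly half.
    lo, hi = 0.0, 1.0
    for _ in range(50):
        mid = (lo + hi) / 2
        if interval_value(density1, 0.0, mid) < half:
            lo = mid
        else:
            hi = mid
    cut = (lo + hi) / 2
    # Player 2 compares the two pieces under his own valuation and picks.
    if interval_value(density2, 0.0, cut) >= interval_value(density2, cut, 1.0):
        return (cut, 1.0), (0.0, cut)   # player 1 gets the right piece
    return (0.0, cut), (cut, 1.0)       # player 1 gets the left piece

# Example: player 1 is uniform; player 2's taste is skewed toward the right end.
piece1, piece2 = cut_and_choose(lambda x: 1.0, lambda x: 2.0 * x)
print("player 1:", piece1, " player 2:", piece2)
```

The guarantee is proportionality: the cutter values both pieces at exactly half, and the chooser takes the piece he weakly prefers, so each player is happy by her own measure.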

A few years ago, I started wondering about the theory behind white elephant exchanges and posted a question on cstheory. The answer I got (from usul) was excellent (tl;dr: worst-case definitions of fairness don't work). But it's also a few years old, and I wonder if there are new results on this topic.
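Purely as an intuition pump, here's a toy simulation of the quoted rules in Python. Everything beyond the rules is an assumption of mine: uniform random private values, greedy players who steal when the best open gift beats the expected value of a wrapped one, and a stolen-from player who opens a new gift rather than re-stealing (a simplification that guarantees termination):

```python
import random

def white_elephant(n, seed=None):
    """Toy white elephant exchange with n players and n gifts (a sketch)."""
    rng = random.Random(seed)
    values = [[rng.random() for _ in range(n)] for _ in range(n)]  # values[p][g]
    wrapped = list(range(n))
    rng.shuffle(wrapped)
    holder = {}                          # gift -> current holder
    touches = {g: 0 for g in range(n)}   # a gift touched 3 times is frozen

    def open_new(p):
        g = wrapped.pop()
        holder[g] = p
        touches[g] += 1

    open_new(0)                          # the first person opens a wrapped gift
    for p in range(1, n):
        stealable = [g for g in holder if touches[g] < 3]
        best = max(stealable, key=lambda g: values[p][g], default=None)
        # Greedy heuristic (an assumption): steal only if the best open gift
        # beats the expected value (1/2) of an unknown wrapped gift.
        if best is not None and values[p][best] > 0.5:
            victim = holder[best]
            holder[best] = p
            touches[best] += 1
            open_new(victim)             # simplification: the victim opens, never re-steals
        else:
            open_new(p)

    # Endgame: "the first person goes again" -- one chance to swap with an
    # unfrozen gift held by someone else.
    g0 = next(g for g, q in holder.items() if q == 0)
    candidates = [g for g, q in holder.items() if touches[g] < 3 and q != 0]
    best = max(candidates, key=lambda g: values[0][g], default=None)
    if best is not None and values[0][best] > values[0][g0]:
        holder[g0], holder[best] = holder[best], 0
    return {q: g for g, q in holder.items()}, values

# Each player's value for the gift they end up holding:
assignment, values = white_elephant(8, seed=42)
print({p: round(values[p][g], 2) for p, g in sorted(assignment.items())})
```

Even in this toy model, where you sit in the turn order matters, which is consistent with usul's point that worst-case notions of fairness are a poor fit for this game.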
