Thursday, April 11, 2019

New conference announcement

Martin Farach-Colton asked me to mention this, which is definitely NOT a pox on computer systems. 
ACM-SIAM Algorithmic Principles of Computer Systems (APoCS20) 
https://www.siam.org/Conferences/CM/Main/apocs20January 8, 2020
Hilton Salt Lake City Center, Salt Lake City, Utah, USA
Colocated with SODA, SOSA, and Alenex 
The First ACM-SIAM APoCS is sponsored by SIAM SIAG/ACDA and ACM SIGACT. 
Important Dates:  
        August 9: Abstract Submission and Paper Registration Deadline
August 16: Full Paper Deadline
October 4: Decision Announcement 
Program Chair: Bruce Maggs, Duke University and Akamai Technologies 
Submissions: Contributed papers are sought in all areas of algorithms and architectures that offer insight into the performance and design of computer systems.  Topics of interest include, but are not limited to algorithms and data structures for: 

  • Databases
  • Compilers
  • Emerging Architectures
  • Energy Efficient Computing
  • High-performance Computing
  • Management of Massive Data
  • Networks, including Mobile, Ad-Hoc and Sensor Networks
  • Operating Systems
  • Parallel and Distributed Systems
  • Storage Systems

A submission must report original research that has not previously or is not concurrently being published. Manuscripts must not exceed twelve (12) single-spaced double-column pages, in addition the bibliography and any pages containing only figures.  Submission must be self-contained, and any extra details may be submitted in a clearly marked appendix. 
Steering Committee: 

  • Michael Bender
  • Guy Blelloch
  • Jennifer Chayes
  • Martin Farach-Colton (Chair)
  • Charles Leiserson
  • Don Porter
  • Jennifer Rexford
  • Margo Seltzer

Tuesday, March 26, 2019

On PC submissions at SODA 2020

SODA 2020 (in SLC!!) is experimenting with a new submission guideline: PC members will be allowed to submit papers. I had a conversation about this with Shuchi Chawla (the PC chair) and she was kind enough (thanks Shuchi!) to share the guidelines she's provided to PC members about how this will work.


SODA is allowing PC members (but not the PC chair) to submit papers this year. To preserve the integrity of the review process, we will handle PC member submissions as follows. 
1. PC members are required to declare a conflict for papers that overlap in content with their own submissions (in addition to other CoI situations). These will be treated as hard conflicts. If necessary, in particular if we don't have enough confidence in our evaluation of a paper, PC members will be asked to comment on papers they have a hard conflict with. However, they will not have a say in the final outcome for such papers.  
2. PC submissions will receive 4 reviews instead of just 3. This is so that we have more confidence on our evaluation and ultimate decision. 
3. We will make early accept/reject decisions on PC members submissions, that is, before we start considering "borderline" papers and worrying about the total number of papers accepted. This is because the later phases of discussion are when subjectivity and bias tend to creep in the most. 
4. In order to be accepted, PC member submissions must receive no ratings below "weak accept" and must receive at least two out of four ratings of "accept" or above.  
5. PC member submissions will not be eligible for the best paper award.

My understanding is that this was done to solve the problem of not being able to get people to agree to be on the PC - this year's PC has substantially more members than prior years.

And yet....

Given all the discussion about conflicts of interest, implicit bias, and double blind review, this appears to be a bizarrely retrograde move, and in fact one that sends a very loud message that issues of implicit bias aren't really viewed as a problem. As one of my colleagues put it sarcastically when I described the new plan:

"why don't they just cut out the reviews and accept all PC submissions to start with?"
and as another colleague pointed out:

"It's mostly ridiculous that they seem to be tying themselves in knots trying to figure out how to resolve COIs when there's a really easy solution that they're willfully ignoring..."

Some of the arguments I've been hearing in support of this policy frankly make no sense to me.

First of all, the idea that a more heightened scrutiny of PC papers can alleviate the bias associated with reviewing papers of your colleagues goes against basically all of what we know about implicit bias in reviewing. The most basic tenet of human judgement is that we are very bad at filtering our own biases and this only makes it worse. The one thing that theory conferences (compared to other venues) had going for them regarding issues of bias was that PC members couldn't submit papers, but now....

Another claim I've heard is that the scale of SODA makes double blind review difficult. It's hard to hear this claim without bursting out into hysterical laughter (and from the reaction of the people I mentioned this to, I'm not the only one).  Conferences that manage with double blind review (and PC submissions btw) are at least an order of magnitude bigger (think of all the ML conferences). Most conference software (including easy chair) is capable of managing the conflicts of interest without too much trouble. Given that SODA (and theory conferences in general) are less familiar with this process, I’ve recommended in the past that there be a “workflow chair” whose job it is to manage the unfamiliarity associated with dealing the software. Workflow chairs are common at bigger conferences that typically deal with 1000s of reviewers and conflicts.

Further, as a colleague points out, what one should really be doing is "aligning nomenclature and systems with other fields: call current PC as SPC or Area Chairs, or your favorite nomenclature, and add other folks as reviewers. This way you (i) get a list of all conflicts entered into the system, and (ii) recognize the work that the reviewers are doing more officially as labeling the PC members. "


Changes in format (and culture) take time, and I'm still hopeful that the SODA organizing team  will take a lesson from ESA 2019  (and their own resolution to look at DB review more carefully that was passed a year or so ago) and consider exploring DB review. But this year's model is certainly not going to help.

Update: Steve Blackburn outlines how PLDI handles PC submissions (in brief, double blind + external review committee)

Update: Michael Ekstrand takes on the question that Thomas Steinke asks in the comments below: "How is double blind review different from fairness-through-blindness?".

Tuesday, February 19, 2019

OpenAI, AI threats, and norm-building for responsible (data) science

All of twitter is .... atwitter?... over the OpenAI announcement and partial non-release of code/documentation for a language model that purports to generate realistic-sounding text from simple prompts. The system actually addresses many NLP tasks, but the one that's drawing the most attention is the deepfakes-like generation of plausible news copy (here's one sample).

Most consternation is over the rapid PR buzz around the announcement, including somewhat breathless headlines (that OpenAI is not responsible for) like

OpenAI built a text generator so good, it’s considered too dangerous to release
or
Researchers, scared by their own work, hold back “deepfakes for text” AI
There are concerns that OpenAI is overhyping solid but incremental work, that they're disingenuously allowing for overhyped coverage in the way they released the information, or worse that they're deliberately controlling hype as a publicity stunt.

I have nothing useful to add to the discussion above: indeed, see posts by Anima Anandkumar, Rob MunroZachary Lipton  and Ryan Lowe for a comprehensive discussion of the issues relating to OpenAI.  Jack Clark from OpenAI has been engaging in a lot of twitter discussion on this as well.

But what I do want to talk about is the larger issues around responsible science that this kerfuffle brings up. Caveat, as Margaret Mitchell puts it in this searing thread.

To understand the kind of "norm-building" that needs to happen here, let's look at two related domains.

In computer security, there's a fairly well-established model for finding weaknesses in systems. An exploit is discovered, the vulnerable entity is given a chance to fix it, and then the exploit is revealed , often simultaneously with patches that rectify it. Sometimes the vulnerability isn't easily fixed (see Meltdown and Spectre). But it's still announced.

A defining characteristic of security exploits is that they are targeted, specific and usually suggest a direct patch. The harms might be theoretical, but are still considered with as much seriousness as the exploit warrants.

Let's switch to a different domain: biology. Starting from the sequencing of the human genome through the million-person precision medicine project to CRISPR and cloning babies, genetic manipulation has provided both invaluable technology for curing disease as well as grave ethical concerns about misuse of the technology. And professional organizations as well as the NIH have (sometimes slowly) risen to the challenge of articulating norms around the use and misuse of such technology.

Here, the harms are often more diffuse, and the harms are harder to separate from the benefits. But the harm articulation is often focused on the individual patient, especially given the shadow of abuse that darkens the history of medicine.

The harms with various forms of AI/ML technology are myriad and diffuse. They can cause structural damage to society - in the concerns over bias, the ways in which automation affects labor, the way in which fake news can erode trust and a common frame of truth, and so many others - and they can cause direct harm to individuals. And the scale at which these harms can happen is immense.

So where are the professional groups, the experts in thinking about the risks of democratization of ML, and all the folks concerned about the harms associated with AI tech? Why don't we have the equivalent of the Asilomar conference on recombinant DNA?

I appreciate that OpenAI has at least raised the issue of thinking through the ethical ramifications of releasing technology. But as the furore over their decision has shown, no single imperfect actor can really claim to be setting the guidelines for ethical technology release, and "starting the conversation" doesn't count when (again as Margaret Mitchell points out) these kinds of discussions have been going on in different settings for many years already.

Ryan Lowe suggests workshops at major machine learning conferences. That's not a bad idea. But it will attract the people who go to machine learning conferences. It won't bring in the journalists, the people getting SWAT'd (and one case killed) by fake news, the women being harassed by trolls online with deep-fake porn images. 

News is driven by news cycles. Maybe OpenAI's announcement will lead to us thinking more about issues of responsible data science. But let's not pretend these are new, or haven't been studied for a long time, or need to have a discussion "started".


Monday, January 28, 2019

FAT* Session 2: Systems and Measurement.

Building systems that have fairness properties and monitoring systems that do A/B testing on us.

Session 2 of FAT*: my opinionated summary.

Sunday, January 27, 2019

FAT* blogging

I'll be blogging about each session of papers from the FAT* Conference. So as not to clutter your feed, the posts will be housed at the fairness blog that I co-write along with Sorelle Friedler and Carlos Scheidegger.

The first post is on Session 1: Framing and Abstraction.

Thursday, December 20, 2018

The theoryCS blog aggregator REBORN

(will all those absent today please email me)

(if you can't hear me in the back, raise your hand)

The theoryCS blog aggregator is back up and running at its new location -- cstheory-feed.org -- which of course you can't know unless you're subscribed to the new feed, which....

More seriously, we've announced this on the cstheory twitter feed as well, so feel free to repost this and spread the word so that all the theorists living in caves plotting their ICML, COLT and ICALP submissions will get the word. 

Who's this royal "we"? Arnab Bhattacharyya and myself (well mostly Arnab :)). 

For anyone interested in the arcana of how the sausage (SoCG?) gets made, read on: 

Arvind Narayanan had set up an aggregator based on the Planet Venus software for feed aggregation (itself based on python packages for parsing feeds). The two-step process for publishing the aggregator works as follows:
  1. Run the software to generate the list of feed items and associated pages from a configuration file containing the list of blogs
  2. Push all the generated content to the hosting server. 
Right now, both Arnab and I have git access to the software and config files and can edit the config to update blogs etc. The generator is run once an hour and the results are pushed to the new server. 

So if you have updates or additions, either of us can make the changes and they should be reflected fairly soon on the main page. The easiest way to verify this is to wait a few hours, reload the page and see if your changes have appeared. 

The code is run off a server that Arnab controls and both of us have access to the domain registry. I say this in the interest of transparency (PLUG!!) but also so that if things go wonky as they did earlier, the community knows who to reach. 

Separately, I've been pleasantly surprised at the level of concern and anxiety over the feed -- mainly because it shows what a valuable community resource the feed is and that I'm glad to be one of the curators. 

If you've read this far, then you really are interested in the nitty gritty, and so if you'd like to volunteer to help out, let us know. It would be useful for e.g to have a volunteer in Europe so that we have different time zones covered when things break. And maybe our central Politburo (err. I mean the committee to advance TCS) might also have some thoughts, especially in regard to their mission item #3:
To promote TCS to and increase dialog with other research communities, including facilitating and coordinating the development of materials that educate the general scientific community and general public about TCS.

Disqus for The Geomblog