Ruminations on computational geometry, algorithms, theoretical computer science and life
Friday, November 30, 2007
The true meaning of peer review
Passing peer review is better understood as saying a paper is not obviously wrong, not obviously redundant and not obviously boring, rather than as saying it's correct, innovative and important. Of course, the contrapositive of this is that an obviously wrong, redundant and boring paper has some nonzero chance of being rejected ;)
Interesting new blogspam method ....
A new form of blog plagiarism combines the best of spam techniques with post copying, so as to evade the obvious duplicate detection methods. Consider the following examples:
The original post:
There's been a lot of grief over the reduction in price for iPhones only a few months after they were released. Wired magazine interviewed people who don't regret paying the higher price for the virtue of being an early adopter/arbiter-of-cool, and this comment caught my eye
The modified post (and no, I'm not going to link to it):
There's been a number of visibility at the decline master p scholomance for online in very couple weeks and we were released. Wired magazine interviewed people who don't regret paying a low frequency of the world as both an early riser and perhaps comment and the eye
The original post:
Suppose you're given a metric space (X, d) and a parameter k, and your goal is to find k "clusters" such that the sum of cluster costs is minimized. Here, the cost of a cluster is the sum (over all points in the cluster) of their distance to the cluster "center" (a designated point).
The modified post:
Lil romeo you're given a multidimensional space (X, d) and a parameter k, and a buff being use find k "clusters" such as lump- end of the effect is minimized. Here, lil romeo use 70million a target wikipedia, the sum of these items out the amount of the need codify the cluster "center" (a designated point).
And so on. What's more mysterious is that this isn't straight plagiarism: the post credits my post as the "source", although given the garbage that's generated, I'm not sure I want that "source" credit :).
Very mystifying...
Saturday, November 24, 2007
Black Friday -- online
Tuesday, November 20, 2007
Author contributions
- Alphabetical (the norm in theoryCS and communities heavily influenced by it)
- Advisor-student: student name comes first, advisor comes last.
- Work-ordering: first author did all the work, last author provided the funding, intermediate authors ordered by contribution (common in systems work)
- Alphabetical ordering conceals who did the real work ("but the cream rises to the top")
- Work ordering allows advisors to slap their names on anything coming out of their lab ("But they did all the work needed to fund this research: grant writing, selling, etc")
- Alphabetical ordering gives undue prominence to authors whose names begin with 'A' (but see here for a counter-theory)
- Alphabetical ordering makes authors with names at the end of the alphabet lazy (Who're you looking at, punk ?)
- Theory papers can't be work-ordered, because ideas "float in the air without ownership" (yes, I have heard this argument many times, so don't snicker)
He also links to an interesting set of axioms that Hardy and Littlewood drew up before their famous collaboration started. I actually admire this approach. As a general rule, theory folks tend to be allergic to detailed discussions about contributions and what-not, but in collaborations it's critically important to lay out the ground rules in advance, and the Hardy-Littlewood axioms are a good example precisely because they head off any problems before they can arise. For example,
And, finally, the fourth, and perhaps most important axiom, stated that it was quite indifferent if one of them had not contributed the least bit to the contents of a paper under their common name . . .
Agreeing to such an axiom requires great trust between the collaborators, because of the immense potential for abuse, and maybe that's exactly the point: a good collaboration requires trust, and you can't agree to such an axiom unless trust already exists.
Monday, November 19, 2007
My first ipelet: 1-medians
The ipelet is for computing the discrete and geometric 1-median of a set of points. The geometric 1-median of a set of points (also called the Fermat-Weber-Steiner-Torricelli-< insert your name here > point) is the point that minimizes the sum of distances to the given points. The discrete 1-median is the same minimization, restricted to candidates from the input itself.
Computing the discrete 1-median of a set of points in the plane is trivial in O(n^2) time, and I didn't do anything clever there. Computing the geometric 1-median is harder: there is no closed-form solution beyond 4 points, and it can be shown that the point cannot be computed using arithmetic operations and k^th roots alone.
There are various approximation schemes: the most pertinent one here (in the plane) is a scheme by Bose, Maheshwari and Morin based on a cone decomposition of the plane. In practice, researchers often use an iterative scheme developed by Weiszfeld; it's a simple iteration built around the fact that at the optimal solution, the sum of unit vectors from it to the input points is zero (formally, this is the condition on the gradient).
The cost function is convex and differentiable except at the input points themselves. Thus, the Weiszfeld iteration is guaranteed to give the optimal solution as long as it doesn't get stuck at one of the data points. This algorithm is often referred to as the "founding algorithm of location science", and there are numerous modifications, extensions, and refinements. The ipelet implements the basic Weiszfeld scheme with a fairly brain-dead termination condition.
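For concreteness, here's a minimal Python sketch of both computations; the function names, the starting point (the centroid) and the termination rule are my own choices, and the actual ipelet (linked below) of course runs inside Ipe rather than standalone.

    import math

    def discrete_1_median(points):
        # Brute-force O(n^2): return the input point minimizing the sum of
        # distances to all input points.
        return min(points, key=lambda c: sum(math.dist(c, p) for p in points))

    def geometric_1_median(points, iters=1000, eps=1e-9):
        # Basic Weiszfeld iteration: repeatedly move the current estimate to the
        # average of the input points weighted by 1/distance, stopping when the
        # step is tiny (a fairly brain-dead termination condition, as noted above).
        dim = len(points[0])
        x = [sum(p[i] for p in points) / len(points) for i in range(dim)]  # start at the centroid
        for _ in range(iters):
            num, den = [0.0] * dim, 0.0
            for p in points:
                dist = math.dist(x, p)
                if dist < eps:              # sitting on a data point: stop here
                    return tuple(x)
                w = 1.0 / dist
                den += w
                for i in range(dim):
                    num[i] += w * p[i]
            new_x = [num[i] / den for i in range(dim)]
            if math.dist(new_x, x) < eps:
                break
            x = new_x
        return tuple(x)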
What possessed me to do such a thing ? Well, I'm trying to find counterexamples for a conjecture involving 1-medians, and needed a way of visualizing examples :).
The code is here: feel free to use it, comment on it, modify it, etc.
IKEA = NP ?
The sad truth is, the naming conventions for furniture at Ikea make for a more consistent language than those of complexity theory.
Ouch ! For more on this, read Lance's post from way back when.
Tuesday, November 13, 2007
A day in the life...
- I taught one lecture of my class
- I attended a departmental committee meeting
- I had a meeting with collaborators to work out a paper outline for something we're submitting
- I met with my student and did some nontechnical advising
- I had a (brief) discussion with a collaborator about a pending grant proposal.
Nowhere in there was any actual research done ! Gaaaah !
Tuesday, November 06, 2007
"As good as your best result"
I've actually heard the opposite claim made by systems researchers, and this has been extended to other empirical researchers as well. Namely, "you're as good as your worst result" (presumably thresholded to ignore grad school). The rationale here appears to be that in the conservative empirical sciences, where a badly designed experiment can cast a shadow on all your later work, it's your worst result that matters.
I can see how this works (sort of): in the more mathematical disciplines, a "proof" can to a large extent be validated independently of the author, so mistakes in past proofs can be tolerated (though if this becomes an endemic problem, then ...). Louis de Branges is famous for his proof of a conjecture in complex analysis known as the Bieberbach conjecture, and is currently well known for his claimed proof of the Riemann Hypothesis (Karl Sabbagh's book on this has more on de Branges). His proof of the Bieberbach conjecture was initially met with some skepticism from other mathematicians, because of some earlier false starts. However, it was soon seen as correct, and it opened up new areas of mathematics as well. As a consequence, his claimed proofs of the Riemann Hypothesis have been given more serious consideration than they might otherwise have received (although I am not aware of the current status of his claims).
On the other hand, validating empirical work rests largely on trust: you trust that the scientists have made faithful measurements, have kept proper lab notes, etc., and I can see how a weakening of that trust can lead to much higher skepticism.
In this light, it's interesting to read Malcolm Gladwell's latest article for the New Yorker. He profiles the "profilers": the psych experts who help the FBI track down serial killers, and comes to this conclusion:
He seems to have understood only that, if you make a great number of predictions, the ones that were wrong will soon be forgotten, and the ones that turn out to be true will make you famous.
Sunday, November 04, 2007
While I wait for inspiration to strike..
What a great idea ! PostSecret for the researcherati....
Wednesday, October 24, 2007
Catchy: "Put the Turing in Manufac-turing"
But U.S. manufacturing today is where database technology was in the early 1960's, a patchwork of ad hoc solutions that lacked the rigorous methodology that leads to scientific innovation. That all changed in 1970 when Ted Codd, an IBM mathematician, invented relational algebra, an elegant mathematical database model that galvanized federally funded research leading to today's $14 billion database industry.
Manufacturing needs the same treatment. Just as the method to add two numbers together doesn't depend on what kind of pencil you use, manufacturing abstractions can be wholly independent of the product one is making or the assembly line systems used to assemble it. Another precedent is the Turing Machine, an elegant abstract model invented by Alan Turing in the 1930s, which established the mathematical and scientific foundations for our now-successful high-tech industries. Without Turing's theoretical work, the system that typeset this line wouldn't exist.
What's needed today is an analogy to the Turing Machine for design, automation and manufacturing. Recent developments in computing and information science have now made it possible to model and reason about physical manufacturing processes, setting the stage for us to "put the Turing into Manufacturing". The result, as was the case with databases and computers, would be higher quality, more reliable products, reduced costs, and faster delivery.
As they say, RTWT.
Untangling a geometric graph
If we put on our theory hats, the first question that pops out is: how many vertices might you need to move in order to go from an arbitrary (possibly tangled) embedding of a planar graph to a planar embedding ? This process of moving vertices around to remove crossings is called untangling. Technically, H is an untangling of G if H is crossing-free and is isomorphic to G (in other words, untangling may allow more general moves than the sliding allowed by Planarity). A vertex of G is said to be fixed if it is mapped (by the isomorphism) to a vertex of H with the same coordinates.
Back in 1998, it was first asked whether a graph could be made planar while keeping a constant fraction of vertices fixed. The answer, shown by Pach and Tardos, was no, via a graph in which at most roughly n^{2/3} of the vertices can stay fixed (which is one way of explaining the difficulty of Planarity). The next question, of course, was whether every planar graph can be untangled while keeping at least some n^epsilon vertices fixed.
A recent paper by Bose, Dujmovic, Hurtado, Langerman, Morin and Wood answers this question in the affirmative, showing that any geometric planar graph can be untangled while keeping at least n^{1/4} vertices fixed. The proofs involved are intricate, but are graph-theoretic and "elementary" in nature.
Wolfram (2,3) machine exists
Read more about it here, and here.
(via BB)
Monday, October 15, 2007
I foresee mass murders of cell phone users...
It’s been a long time, but Led Zeppelin, one of the last superstar acts to refrain from selling its music online, is finally offering its catalog to digital-music fans. The shift by Led Zeppelin, whose reunion concert in London next month has already incited a frenzy for tickets, highlights the clout of digital sales in the music market as mass merchants reduce the shelf space devoted to compact discs. Under a series of new agreements expected to be announced today, the band will make its songs available first as ringtones and similar mobile features starting this week in an exclusive deal with Verizon Wireless.
And if you don't believe me, watch Wayne's World.
Thursday, October 11, 2007
Popularizing computing
Two recent articles of his are exemplars:
1. Sorting out the genome, from September-October 2007. Here's a key extract:
Given two arrangements of a set of genes—say a b c d e f g and f e b a g c d—how do you determine what sequence of reversals produced the transformation? This example has a three-step solution, which you can probably find in a few minutes with pencil and paper. For larger genomes and longer chains of reversals, however, trial-and-error methods soon falter. Is there an efficient algorithm for identifying a sequence of reversals that converts one permutation into another?
The article proceeds to lay out the entire history of sorting with reversals, pancake sorting, and related problems. It's a great case study of algorithms research in an application domain: how to model a problem, come up with algorithms, discover that the model doesn't quite capture what you want, rethink the model, come up with more algorithms, and so on. It touches on dynamic programming, greedy algorithms, NP-hardness, and approximations. I liked it so much I'll be devoting a lecture in my algorithms class to it, to illustrate how algorithms research works "in the real world".
The genetic reversal problem lies at the intersection of biology, mathematics and computer science. For some time, the prospects for finding a simple and efficient solution seemed dim, even with the most powerful tools of all three disciplines. But the story has a happy ending. A little more than a decade ago, computing gene reversals was still a subtle research problem; now it can be done with such ease that it's a matter of routine technology. If you need to know the "reversal distance" between two genomes, you can go to a Web site and get the answer in seconds.
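As a toy illustration of the problem (and emphatically not of the clever algorithms the article describes), here is a brute-force Python sketch that finds the reversal distance by breadth-first search over permutations. It is only feasible for tiny examples like the one quoted above, which is precisely why the real algorithmic story matters; the function name is mine.

    from collections import deque

    def reversal_distance(start, target):
        # BFS over permutations, where each move reverses a contiguous block.
        start, target = tuple(start), tuple(target)
        frontier = deque([(start, 0)])
        seen = {start}
        while frontier:
            perm, dist = frontier.popleft()
            if perm == target:
                return dist
            for i in range(len(perm)):
                for j in range(i + 2, len(perm) + 1):
                    nxt = perm[:i] + perm[i:j][::-1] + perm[j:]
                    if nxt not in seen:
                        seen.add(nxt)
                        frontier.append((nxt, dist + 1))
        return None  # unreachable if start and target contain the same genes

    # The example from the article: a shortest solution uses three reversals.
    print(reversal_distance("abcdefg", "febagcd"))   # prints 3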
2. Conquering divide, bit-player, October 9th, 2007.
The "hook" for this article is Eric Allender's review article on the true complexity of division (short summary: Division is complete for DLOGTIME-uniform TC^0), which is a must-read in its own right. What Hayes discusses in his article is the reason to even ask this question, starting with the general lack of symmetry between grade-school methods for multiplication and division. I found this particularly insightful:
Why is division so hard?
At times I feel this is a dumb question, and that the universe doesn’t owe us an answer to such questions. Some things are just harder than others; that’s the way the world works. We can count ourselves lucky that multiplication has a polynomial-time algorithm; we have no right to expect that our luck will hold when we turn to division.
But then I get to thinking about physical systems that multiply and divide, where the two operations seem perfectly symmetrical. An example is an electrical circuit governed by Ohm’s law.
If we attach the terminals of a voltmeter to opposite ends of a resistor, we observe Ohm’s law in the form E = IR: Voltage is equal to the product of current and resistance. If we install an ammeter in series with the resistor, we see the effects of the law I = E/R: Current is equal to voltage divided by resistance. If you prefer pipes to wires, you can do the hydraulic version of the experiment, and there are also lots of mechanical schemes, where the same principle is illustrated with levers, gears, pulleys and such. In all of these systems, nature seems to divide just as easily as it multiplies. Why can’t we do the same with pencil and paper, or transistors, or neurons?
Saturday, October 06, 2007
CoM #18
Well done !
Sunday, September 30, 2007
"Related Work" vs "Prior Work"
"Related work" or "Prior work" ? I've seen both section titles used when describing all the work that came before the magnificient specimen of research that I'm reading right now. Personally, I prefer 'Prior work': it seems somehow closer to the spirit of 'I stand on the shoulders of giants', even if that was really just a dig at Hooke. 'Related Work' to me has the tone of a grudging 'yeah yeah, these guys did something that might vaguely resemble what we do here, and I guess I should include it, but don't imagine that their work in any way influences my brilliant thoughts, which are sui generis'.
An issue that appears to be related to this is the placement of the Related/Prior Work section. I personally prefer to have prior work discussions just after the introduction, unless discussing prior work requires the use of notation that I introduce in a definitions section. In many papers (and some of mine as well), the prior work gets shunted off to the no-man's land before the conclusions. I wonder if there's some correlation between the choice of the title, 'Related Work' and the banishing of the section to the very end, a kind of double dismissal of the value of the cited works.
Wednesday, September 26, 2007
The "Voronoi trick"
It's a very interesting paper, and at the risk of sounding patronizing, I have to say that I'm impressed at their use of power diagrams and the Voronoi trick, something that many theoryCS folks may not have encountered. Once again, the problem they're looking at is redistricting: can you design voting districts to satisfy certain fairness criteria ?
One specific question that they ask is: can you evaluate the quality of a state's redistricting plan ? Their answer is in the form of an approximation ratio. They fix a measure of quality for a redistricting, and take the ratio of the actual plan's cost to the cost of the optimal plan. Their cost, in this case, is a well-known clustering measure: the cost of a cluster (in this case, a district) is the sum of squared distances over all pairs of points in it, and the cost of a clustering (in this case, a redistricting) is the sum of cluster costs. One extra twist is that they require all clusters to be of equal size, up to rounding errors. That is, a 2-clustering of 9 people must have one cluster with 5 people and one cluster with 4.
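To make the objective concrete, here's a minimal Python sketch of that cost function as just described; the function name and the explicit size check are mine.

    import math

    def redistricting_cost(districts):
        # Cost of one district: sum of squared distances over all pairs of points
        # (people) in it.  Cost of the districting: sum of district costs.
        # The sizes must be balanced up to rounding (differ by at most one).
        sizes = [len(d) for d in districts]
        assert max(sizes) - min(sizes) <= 1, "districts must be equal-sized up to rounding"
        total = 0.0
        for district in districts:
            for i in range(len(district)):
                for j in range(i + 1, len(district)):
                    total += math.dist(district[i], district[j]) ** 2
        return total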
The problem of computing an optimal clustering is NP-hard if k, the number of clusters, is part of the input. However, for fixed k, there's an elegant method for solving these kinds of problems, which I'll call the "Voronoi trick".
The Voronoi Trick:
A clustering of n items into k clusters is a partitioning problem, and as such there are roughly k^n ways of assigning items to clusters (k^n comes from treating the clusters as distinct; in general, the count might shrink by a factor of up to k!). But for geometric problems, we expect that not all partitions are candidates for the optimum, because of proximity constraints and the triangle inequality. Is there some way of restricting the set of possible partitions ?
Consider the following set of possible partitions of a set of n points in a Euclidean space. We put down k centers, and assign each point of the input to its nearest center. Not all placements of k centers lead to distinct partitions: I can always perturb one center slightly without changing the combinatorial structure of the partition. We'll call such a partition a Voronoi partition, from the connection to Voronoi diagrams.
Now it's clear that not all partitions of the input are realizable as Voronoi partitions. For example, for k=2 centers in the plane, there are only O(n^2) distinct Voronoi partitions (each is determined by the perpendicular bisector of the two centers), rather than the 2^n ways of partitioning a set of items into two groups. In general, for a set of n points in d dimensions, there are roughly n^{kd} distinct Voronoi partitions, which is polynomial in n for fixed k and d, rather than exponential.
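A tiny Python sketch of this construction (the function name and the random demo are mine): given a placement of centers, the induced Voronoi partition is just the nearest-center labelling, and even many random placements of two centers turn up only a handful of distinct labellings.

    import math, random

    def voronoi_partition(points, centers):
        # Label each point with the index of its nearest center; ties are broken
        # by center index in this sketch.
        return tuple(min(range(len(centers)),
                         key=lambda j: math.dist(p, centers[j]))
                     for p in points)

    pts = [(random.random(), random.random()) for _ in range(10)]
    partitions = {voronoi_partition(pts, [(random.random(), random.random()),
                                          (random.random(), random.random())])
                  for _ in range(5000)}
    print(len(partitions), "distinct Voronoi 2-partitions seen, versus 2^10 = 1024 labelled 2-partitions overall")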
Now here comes the trick: for many clustering problems, including the one mentioned above, we can show that the optimal clustering must be realizable as a Voronoi partition. This kind of result has appeared in many different forms: Inaba et al showed that for two different kinds of minimum variance clustering (one where the cost of a cluster is the sum of squared distances from the center, and the other where the cost is the pairwise sum of squared distances of points in the cluster), the optimal clustering is a Voronoi partition. Other forms of this trick were used by Aurenhammer et al in their paper on least-squares partitioning where they showed that we could compute the optimal RMS matching between two point sets using Voronoi partitions.
One aspect of this trick that I elided: often, the Voronoi partition doesn't come from the straight distance function, but from a weighted one. One way this is done is to associate a weight w(s) with each site, and define the Voronoi bisector as the set of points satisfying d(x,s) - w(s) = d(x,s') - w(s'), rather than the usual d(x,s) = d(x,s'). If you recall the view of a Voronoi diagram as the lower envelope of a collection of cones in one higher dimension, with apexes at the site locations, then this additively weighted Voronoi diagram corresponds to lifting the cones by different (even negative) heights given by the weight function w(s). The variant most relevant here applies the weights to squared distances, with bisectors given by d(x,s)^2 - w(s) = d(x,s')^2 - w(s'); this diagram is called a power diagram, and since its bisectors are hyperplanes, its cells are convex.
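As a sketch of how the weighted assignment looks in code (function and parameter names are mine), each point simply goes to the center minimizing the power distance d(x,s)^2 - w(s):

    import math

    def power_diagram_partition(points, centers, weights):
        # Weighted version of the nearest-center assignment: point p is assigned
        # to the center s minimizing d(p, s)^2 - w(s).
        return tuple(min(range(len(centers)),
                         key=lambda j: math.dist(p, centers[j]) ** 2 - weights[j])
                     for p in points)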
The Voronoi trick and clustering:
Apart from the fact that the Voronoi trick allows us to reduce the space of possible partitions to consider, it also allows us to isolate the influence of points on clusters. This idea was first exploited in the above-mentioned paper by Inaba et al, and has since been used in approximate clustering work by Badoiu, Har-Peled and Indyk, as well as the recent linear-time k-means algorithms by Kumar, Sabharwal and Sen. The argument goes something like this:
Since we know that each center is "influenced" only by the points in its Voronoi cell, we can focus on the problem of finding a single center out of the k. This is typically done by running a (simpler) 1-median or 1-means algorithm on a random sample of points that is likely to contain a large number of points from a single cluster. Once this is done, we can "peel off" that cluster and repeat. The ability to do random sampling and peeling comes from the idea of irreducibility: namely, that either a k-clustering can be well approximated by a (k-1)-clustering, or, if not, the k^th cluster is "reasonably well separated" from the rest and can be isolated using the ideas above.
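A toy Python sketch of the sample-then-peel skeleton (this is only an illustration, not the actual algorithms of Badoiu-Har-Peled-Indyk or Kumar-Sabharwal-Sen, which come with guarantees this sketch does not; all names and the peeling rule are mine):

    import math, random

    def sample_and_peel(points, k, sample_size=20):
        dim = len(points[0])
        remaining = list(points)
        centers = []
        for _ in range(k):
            if not remaining:
                break
            sample = random.sample(remaining, min(sample_size, len(remaining)))
            # the 1-means solution for the sample is just its centroid
            c = tuple(sum(p[i] for p in sample) / len(sample) for i in range(dim))
            centers.append(c)
            # peel off roughly one cluster's worth of the closest remaining points
            remaining.sort(key=lambda p: math.dist(p, c))
            remaining = remaining[len(points) // k:]
        return centers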
Returning to redistricting:
Coming back to the redistricting paper, what they do is essentially show that their all-pairs measure of the cost of a cluster reduces to the 'cost to the centroid' measure, since all clusters are required to have the same size. In this case, they can make use of the power diagram to search efficiently for solutions. If I may nitpick, I wonder why they didn't look into the many approximation schemes available for these problems (but this is really a nitpick).
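The identity doing the work here is presumably the standard one: for a cluster of m points with centroid c, the sum over all pairs {x, y} of ||x - y||^2 equals m times the sum over points x of ||x - c||^2. When every cluster has the same size m, the all-pairs objective is therefore just m times the sum-of-squared-distances-to-centroid objective, so the two are minimized by the same partition.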
Postscript:
There are other tricks that are also worthy of being called a 'Voronoi trick'. One such idea dates back to a paper by Dobkin, Drysdale and Guibas from 1983 ("D. P. Dobkin, R. L. Drysdale III, and L. J. Guibas. Finding smallest polygons. In Advances in Computing Research, Vol. 1, JAI Press, 1983, 181-214."), and works as follows:
Suppose you want to find the k points from a point set of size n that optimize some function: for example, you want the k points with the smallest enclosing ball, or the k points with the smallest convex hull area, and so on. Further, let's say you have some expensive algorithm that can solve this problem in time O(n^c). For many of these functions, it can be shown that the optimal set of k points must lie within a single cell of the O(k)^th order Voronoi diagram of the point set. In case you hadn't encountered this notion before, the k^th order Voronoi diagram is what you get if, instead of taking the lower envelope of your cones, you take the k^th level of the arrangement. Intuitively, it consists of cells within each of which every point has the same set of k nearest neighbours.
The k^th order Voronoi diagram of points in the plane has complexity O(kn). Using the above fact, we can enumerate all the cells and run the expensive algorithm on the O(k) points associated with each cell. This gives a running time of O(k^c · kn) = O(k^{c+1} n), rather than O(n^c). This idea works up to a point: see this paper by fellow bloggers David and Jeff for improvements that bypass the Voronoi construction in favor of iterated nearest neighbors.
Friday, September 21, 2007
Manifold learning and Geometry
Given a collection of (unlabelled) data inhabiting some high dimensional space, can you determine whether they actually lie on some lower dimensional manifold in this space ?
This talk was a great window into an area of machine learning with strong geometric connections, an area where geometers could profitably play and make useful contributions.
Now NIPS (one of the major machine learning conferences) is returning the favor, with a workshop devoted to the topic of Topology Learning:
There is a growing interest in Machine Learning, in applying geometrical and topological tools to high-dimensional data analysis and processing.
Considering a finite set of points in a high-dimensional space, the approaches developed in the field of Topology Learning intend to learn, explore and exploit the topology of the shapes (topological invariants such as the connectedness, the intrinsic dimension or the Betti numbers), manifolds or not, from which these points are supposed to be drawn.
Applications likely to benefit from these topological characteristics have been identified in the field of Exploratory Data Analysis, Pattern Recognition, Process Control, Semi-Supervised Learning, Manifold Learning and Clustering.
However it appears that the integration in the Machine Learning and Statistics frameworks of the problems we are faced with in Topology Learning, is still in its infancy. So we wish this workshop to ignite cross-fertilization between Machine Learning, Computational Geometry and Topology, likely to benefit to all of them by leading to new approaches, deeper understanding, and stronger theoretical results about the problems carried by Topology Learning.
The list of invited speakers bridges geometry, topology and learning. It looks like a great forum for continuing the machine learning-computational geometry cross-talk that Partha kicked off at SoCG.
Thursday, September 20, 2007
Modelling via the algorithmic lens
Such work, however, can be an endless source of frustration. Most commonly, there are cultural differences between areas, and terminology barriers that need to be overcome. But most of all, there's a sense of apathy towards any effort at producing formal guarantees for computational methods (either for running times or for quality). In general, methods based on formal algorithmic analysis must still prove themselves in the experimental setting; they possess no intrinsic value by virtue of being correct, having running time and quality guarantees, etc.
This in itself can be defended: if the main bottlenecks in providing an effective practical solution to a problem are not computational (though they often are), then it's not clear that a practitioner needs to care about esoteric notions like formal guarantees (which can often be quite weak). What is harder to surmount is the 'I've dug my hole and I like it' phenomenon.
For example, the RMS (root-mean-square) distance is commonly used to compare two 3D protein models. But I've never heard any fundamental argument as to why this measure is superior to other possible candidates for matching shape. What's worse, the RMS measure is harder to minimize (under transformations) than other equally plausible measures of shape similarity. However, since it's the status quo, it's hard to change this, short of voluminous studies that argue for the relative merits of a different measure based on comparisons across many shapes.
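For reference, here's a minimal Python sketch of the RMS distance as it's typically defined for two models that have already been superimposed (the function name is mine, and the minimization over transformations, which is the harder part alluded to above, is omitted):

    import math

    def rms_distance(model_a, model_b):
        # Root-mean-square distance between two models given as equal-length
        # lists of 3D coordinates, assumed to be in correspondence.
        assert len(model_a) == len(model_b)
        total = sum(math.dist(p, q) ** 2 for p, q in zip(model_a, model_b))
        return math.sqrt(total / len(model_a))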
Lest it seem that I'm merely whining, Barbie-style, because 'Modelling is hard', let me summarize my main point:
The computational efficacy and guarantees associated with a particular modelling technique are not in and of themselves sufficient to convince a practitioner that this approach is valid, even if in all other respects, the method is identical to other, more entrenched methods.
This makes me wonder about the best way to do algorithmic modelling, if one chooses to do it at all.
Sunday, September 16, 2007
Sunday afternoon MathTubing
The videos are listed under the category, 'How-to and DIY'. Indeed ! Make your own monads, in 9 minutes or less !