Comments on The Geomblog: Important heuristics: Alternating optimization

2011-10-19T13:21:14.125-06:00

It's a special case of coordinate descent, yes.
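
To make the connection concrete, here is a minimal sketch of alternating optimization written as two-block coordinate descent. The objective f(x, y) = x^2 + y^2 + xy - 3x and the function name are made up for illustration; each step minimizes exactly over one variable while the other is held fixed.

```python
def alternating_minimize(x=0.0, y=0.0, sweeps=50):
    """Minimize f(x, y) = x^2 + y^2 + x*y - 3*x by exact coordinate updates.

    The objective is a hypothetical example chosen so that each
    one-variable minimizer has a closed form.
    """
    for _ in range(sweeps):
        # Fix y, minimize over x: df/dx = 2x + y - 3 = 0
        x = (3.0 - y) / 2.0
        # Fix x, minimize over y: df/dy = 2y + x = 0
        y = -x / 2.0
    return x, y


x_opt, y_opt = alternating_minimize()
```

For this convex quadratic the iterates approach the unique minimizer (2, -1); for non-convex objectives such as k-means, the same scheme is only guaranteed to reach a local optimum.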

2011-07-11T19:55:32.726-06:00

Isn't this coordinate ascent/descent?

2008-08-09T11:17:00.000-06:00

Hi,

Interesting post.

Does anybody have a PDF/PS file of the Csiszar/Tusnady paper?

Thanks,
Thomas

2008-01-29T20:54:00.000-07:00

Hmm. Interesting question. I actually don't know what one might do for general log-likelihood maximization; it would be interesting to see whether the k-means-type analyses actually extend.

2008-01-29T07:07:00.000-07:00

I work in an applied ML field (speech technology), and we use k-means and EM daily, so the topic of the post is really relevant to me.

Now that we are talking about EM and friends: a question I have been wondering about for some time is the complexity of the underlying problem in EM (i.e. log-likelihood maximization).

The case of k-means, as Suresh wrote, is a discrete one: we can always just consider the assignments of data vectors to different partitions. So we get a nice combinatorial problem, but in the case of log-likelihood maximization no such discrete structure exists. Even if we restrict the mean vectors of the GMM to be a subset of the data vectors, we still need to somehow select the component weights and covariance matrices.

Do we need to use Blum's complexity theory (from Complexity and Real Computation) to analyze this problem?

2008-01-28T21:36:00.000-07:00

Interesting. This seems to be closely related to a new paper by Rifkin et al. on value regularization.

http://www.jmlr.org/papers/volume8/rifkin07a/rifkin07a.pdf

2008-01-28T17:57:00.000-07:00

An interesting generalization of EM is CCCP (the convex-concave procedure, not the country); it's been used to get convergent message-passing algorithms for loopy graph decoding.

2008-01-28T15:48:00.000-07:00

I didn't. I think of EM and k-means as essentially the same process (k-means is the discrete analog of EM, which itself is a special case of Blahut-Arimoto, with a distance function defined by the induced Bregman distance).
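
To illustrate the "discrete analog" view, here is a rough sketch of k-means (Lloyd's iteration) in one dimension, written so that the two alternating steps are explicit: a hard assignment step that plays the role of EM's E-step, and a mean update that plays the role of the M-step. The function name, the 1-D setting, and the choice of squared Euclidean distance are assumptions for illustration only.

```python
def lloyd_1d(points, centers, iters=20):
    """Alternate hard assignments and mean updates on 1-D data.

    Returns the final centers, the last labels, and the cost recorded
    after each update step.
    """
    centers = list(centers)
    labels = []
    history = []
    for _ in range(iters):
        # Assignment step (hard "E-step"): centers fixed,
        # each point goes to its nearest center.
        labels = [
            min(range(len(centers)), key=lambda j: (x - centers[j]) ** 2)
            for x in points
        ]
        # Update step ("M-step"): labels fixed,
        # each center moves to the mean of its points.
        for j in range(len(centers)):
            members = [x for x, lab in zip(points, labels) if lab == j]
            if members:  # leave an empty cluster's center where it is
                centers[j] = sum(members) / len(members)
        cost = sum((x - centers[lab]) ** 2 for x, lab in zip(points, labels))
        history.append(cost)
    return centers, labels, history


centers, labels, history = lloyd_1d([0, 1, 2, 10, 11, 12], [0.0, 1.0])
```

Neither step can increase the sum of squared distances, so the recorded cost is non-increasing from one iteration to the next; replacing the hard assignment with posterior responsibilities gives the usual soft EM update.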

2008-01-28T09:03:00.000-07:00

In coding theory, this approach is called the Blahut-Arimoto algorithm; it is used to find the capacity of certain channels numerically.
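
For readers who have not seen it, here is a minimal sketch of the Blahut-Arimoto iteration for the capacity of a discrete memoryless channel. It assumes the channel is given as a row-stochastic matrix `W[x][y] = P(y | x)`; the function name, tolerance, and iteration cap are illustrative choices, not taken from any particular library.

```python
import math


def blahut_arimoto(W, tol=1e-12, max_iter=10_000):
    """Estimate channel capacity (in bits) and a capacity-achieving input law.

    W[x][y] is the probability of output y given input x.
    """
    n_in, n_out = len(W), len(W[0])
    p = [1.0 / n_in] * n_in  # start from the uniform input distribution
    lower = 0.0
    for _ in range(max_iter):
        # Output distribution induced by the current input distribution.
        q = [sum(p[x] * W[x][y] for x in range(n_in)) for y in range(n_out)]
        # c[x] = exp(KL(W(.|x) || q)), with 0 * log 0 treated as 0.
        c = [
            math.exp(sum(W[x][y] * math.log(W[x][y] / q[y])
                         for y in range(n_out) if W[x][y] > 0))
            for x in range(n_in)
        ]
        z = sum(p[x] * c[x] for x in range(n_in))
        lower = math.log(z)       # lower bound on capacity (nats)
        upper = math.log(max(c))  # upper bound on capacity (nats)
        # Reweight the inputs toward those with larger divergence.
        p = [p[x] * c[x] / z for x in range(n_in)]
        if upper - lower < tol:
            break
    return lower / math.log(2), p


# Binary symmetric channel with crossover probability 0.1.
capacity, p_star = blahut_arimoto([[0.9, 0.1], [0.1, 0.9]])
```

For the binary symmetric channel above, the result should agree with the closed form 1 - H(0.1), roughly 0.531 bits, with a uniform input distribution.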

2008-01-28T08:30:00.000-07:00

As another important alternating algorithm for local search, let's not forget the EM algorithm in machine learning.