tag:blogger.com,1999:blog-6555947.post1717136894276259408..comments2023-09-10T07:21:33.251-06:00Comments on The Geomblog: Important heuristics: Alternating optimizationSuresh Venkatasubramanianhttp://www.blogger.com/profile/15898357513326041822noreply@blogger.comBlogger10125tag:blogger.com,1999:blog-6555947.post-20610402035334378202011-10-19T13:21:14.125-06:002011-10-19T13:21:14.125-06:00It's a special case of coordinate descent, yes...It's a special case of coordinate descent, yes.Suresh Venkatasubramanianhttps://www.blogger.com/profile/15898357513326041822noreply@blogger.comtag:blogger.com,1999:blog-6555947.post-19398852795984096882011-07-11T19:55:32.726-06:002011-07-11T19:55:32.726-06:00isn't this coordinate ascent/descent?isn't this coordinate ascent/descent?Timothy Weehttps://www.blogger.com/profile/14906745045127017671noreply@blogger.comtag:blogger.com,1999:blog-6555947.post-23545787817676169652008-08-09T11:17:00.000-06:002008-08-09T11:17:00.000-06:00Hi,interesting post. Does anybody have a PDF/PS fi...Hi,<BR/><BR/>interesting post. <BR/><BR/>Does anybody have a PDF/PS file of the Csiszar/Tusnady paper?<BR/><BR/>Thanks,<BR/>Thomastdhttps://www.blogger.com/profile/14152142410891472095noreply@blogger.comtag:blogger.com,1999:blog-6555947.post-19172025282247836992008-01-29T20:54:00.000-07:002008-01-29T20:54:00.000-07:00Hmm. interesting question. I actually don't know w...Hmm. interesting question. I actually don't know what one might do for general log likelihood maximization: it would be interesting to see if the k-means type analyses actually extend.Suresh Venkatasubramanianhttps://www.blogger.com/profile/15898357513326041822noreply@blogger.comtag:blogger.com,1999:blog-6555947.post-75919693256172742932008-01-29T07:07:00.000-07:002008-01-29T07:07:00.000-07:00I am working in a applied ML field (speech technol...I am working in a applied ML field (speech technology) and we use k-means and EM daily. So the topic of the post is really relevant to me. <BR/><BR/>Now that we are talking about EM and the friends. The problem I have been wondering for a some time is what is the complexity of the underlying problem in EM (i.e. loglikelihood maximization)?<BR/><BR/>The case of K-means, as Suresh wrote is a discrete one, we can always just consider the assignments of data vectors to different partitions. So we got a nice combinatorial problem, but in the case of loglikehood maximization no such discrete structure exits. Even if we restrict the mean vectors of the GMM to be subset of the data vectors, we still need to somehow select component weights and covariance matrices. <BR/><BR/>Do we need to use Blums Complexity theory (from Complexity and Real Computation) to analyze this problem?Unknownhttps://www.blogger.com/profile/00227991895773060923noreply@blogger.comtag:blogger.com,1999:blog-6555947.post-35304414510792448062008-01-28T21:36:00.000-07:002008-01-28T21:36:00.000-07:00Interesting. this seems to be closely related to a...Interesting. this seems to be closely related to a new paper by Rifkin et al on value regularization.<BR/><BR/>http://www.jmlr.org/papers/volume8/rifkin07a/rifkin07a.pdfSuresh Venkatasubramanianhttps://www.blogger.com/profile/15898357513326041822noreply@blogger.comtag:blogger.com,1999:blog-6555947.post-65550992616394472292008-01-28T17:57:00.000-07:002008-01-28T17:57:00.000-07:00A interesting generalization of EM is the CCCP (co...A interesting generalization of EM is the <A HREF="http://www.stat.ucla.edu/~yuille/pubs/optimize_papers/cccp_nips01.pdf" REL="nofollow">CCCP </A>(convex-concave procedure, not the country), it's been used to get convergent message-passing algorithms for loopy graph decoding.Yaroslav Bulatovhttps://www.blogger.com/profile/06139256691290554110noreply@blogger.comtag:blogger.com,1999:blog-6555947.post-86999215168707505232008-01-28T15:48:00.000-07:002008-01-28T15:48:00.000-07:00I didn't. I think of EM and k-means as essentially...I didn't. I think of EM and k-means as essentially the same process (k-means is the discrete analog of EM, which itself is a special case of Blahut-Arimoto, with a distance function defined by the induced Bregman distance)Suresh Venkatasubramanianhttps://www.blogger.com/profile/15898357513326041822noreply@blogger.comtag:blogger.com,1999:blog-6555947.post-8947606832776719432008-01-28T09:03:00.000-07:002008-01-28T09:03:00.000-07:00In coding theory, this approach is called the Blah...In coding theory, this approach is called the Blahut-Arimoto algorithm; they apply it for (numerically) finding channel capacity of certain channels.Michael Mitzenmacherhttps://www.blogger.com/profile/02161161032642563814noreply@blogger.comtag:blogger.com,1999:blog-6555947.post-56743015958904835592008-01-28T08:30:00.000-07:002008-01-28T08:30:00.000-07:00As another important alternating algorithm for loc...As another important alternating algorithm for local search, let's not forget the <A HREF="http://en.wikipedia.org/wiki/Expectation-maximization_algorithm" REL="nofollow">EM algorithm</A> in machine learning.Anonymousnoreply@blogger.com