I dislike them greatly though (even though in my misguided youth I used to play with GAs) because they are extremely general hammers that typically don't have provably performance or running time guarantees, and don't reveal any interesting insights into the structure of a problem.

Such search strategies are also victims of their own generality, and the dagger that pierces them is called a "No Free Lunch" theorem. No Free Lunch theorems are generally of the form

...all algorithms that search for an extremum of a cost function perform exactly the same, when averaged over all possible cost functions. In particular, if algorithm A outperforms algorithm B on some cost functions, then loosely speaking there must exist exactly as many other functions where B outperforms A.The crucial point here is the "averaged over all cost functions". The NFL theorems are not saying (and this was something I used to be confused about) that search procedures are doomed; rather, they say that blind search procedures (such as those performed by genetic algorithms) are doomed. The original NFL theorem is attributed to Wolpert and Macready; many variants have been proved as well.

Joseph Culberson has a nice perspective on such theorems from an algorithmic point of view, and attempts to frame them in the context of complexity theory.

Tags: genetic, algorithms

howdy

ReplyDeleteWe work like a horse.

We eat like a pig.

We like to play chicken.

You can get someone's goat.

We can be as slippery as a snake.

We get dog tired.

We can be as quiet as a mouse.

We can be as quick as a cat.

Some of us are as strong as an ox.

People try to buffalo others.

Some are as ugly as a toad.

We can be as gentle as a lamb.

Sometimes we are as happy as a lark.

Some of us drink like a fish.

We can be as proud as a peacock.

A few of us are as hairy as a gorilla.

You can get a frog in your throat.

We can be a lone wolf.

But I'm having a whale of a time!

You have a riveting web logand undoubtedly must have

atypical & quiescent potential

for your intended readership.

May I suggest that you do

everything in your power to

honor your encyclopedic/omniscient

Designer/Architect as well

as your revering audience.

As soon as we acknowledge

this Supreme Designer/Architect,

Who has erected the beauteous

fabric of the universe, our minds

must necessarily be ravished with

wonder at this infinate goodness,

wisdom and power.

Please remember to never

restrict anyone's opportunities

for ascertaining uninterrupted

existence for their quintessence.

There is a time for everything,a season for every activity

under heaven. A time to be

born and a time to die. A

time to plant and a time to

harvest. A time to kill and

a time to heal. A time to

tear down and a time to

rebuild. A time to cry and

a time to laugh. A time to

grieve and a time to dance.

A time to scatter stones

and a time to gather stones.

A time to embrace and a

time to turn away. A time to

search and a time to lose.

A time to keep and a time to

throw away. A time to tear

and a time to mend. A time

to be quiet and a time to

speak up. A time to love

and a time to hate. A time

for war and a time for peace.

Best wishes for continued ascendancy,

Dr. Howdy

'Thought & Humor'P.S. One thing of which I am sure isthat the common culture of my youth

is gone for good. It was hollowed out

by the rise of ethnic "identity politics,"

then splintered beyond hope of repair

by the emergence of the web-based

technologies that so maximized and

facilitated cultural choice as to make

the broad-based offerings of the old

mass media look bland and unchallenging

by comparison."

{Please note that this letter about your

esteemed site promotes no merchandise -

but is simply a missive of good will to you.}

////////////////////////

Posted by

Isn't "blind search" a bit of a misnomer (**)?

Alex Lopez-Ortiz

ReplyDeleteUsually the combination operators and fitness function in GAs are highly fine tuned to the specific problem being solved. While it is true that generally such fine tuning never explicitly involves the first derivative in some (many?) cases it implicitly encodes such information.

For example, for a specific (and rather successful) application that I have in mind, one of the combination operators was an exhaustive local search of the neighbourhood of the current solution, with the local maximum in this neighbourhood being the sole survivor (first derivative anyone?)

One of John Koza's students told me once that in their Stanford group they used to spend a month or two fine-tunning the combination operators and fitness function before unleashing the search. The harderst part was doing so without biasing the shape of the solution. In other words, one wants combination operators that preserve whatever is good about the current solution, rather than combination operators that will make the current solution look the way you believe good solutions ought to look like. [Koza's group has the distinction of having filed and obtained a patent for a device discovered by a GA.]

(**) Admittedly the "blind search" moniker was advanced by some over-enthusiastic GA researchers. In principle this would be a "nice" property of GAs, because if true it obviates the need to program a refined search algorithm.

Posted by

I guess what irks me is the claim to 'generality', and the use of these methods as hammers. The way I see it, this is s catch-22. Either you believe that GAs are general, and require no problem specific tuning, in which case the NFL theorems apply. If you don't, and believe that specific fine tuning of a GA is required to obtain local/global maxima, then why on earth wouldn't you spend the same time trying to understand the problem formally ?

Suresh

ReplyDeleteTSPs (and I should have linked the paper) are a good example of this. A GA-like algorithm is one of the best practical heuristics for performing TSPs, but this has come after a wealth of understanding of problem, and in fact even to establish the quality of the GA-based heuristic, we need a better formal understanding of TSP lower bounds.

Posted by

The point is that in some instances the formal study of the problem is more difficult than having the computer automatically search for a solution. A good example is the evaluation function of Deep Thought/Deep Blue. AI chess researchers had spent an enormous amount of time trying to understand the ins and outs of evaluation functions. In contrast, the Deep Blue team (which had relatively little knowledge of chess or AI) created a very good evaluation function using a GA search technique. (The function was later on further refined by hand under the efforts of Joel Benjamin).

Alex Lopez-Ortiz

ReplyDeleteAnother example is the design of an amplifier circuit. This is a well studied, yet quite difficult problem. Koza's team uncovered a new design that was in many ways superior to existing designs.

In both cases the parallel GA search time consumed several years of CPU time.

Posted by

"

Suresh

ReplyDeleteThe point is that in some instances the formal study of the problem is more difficult than having the computer automatically search for a solution."I think this is the main point. In other words, the GA search strategy is a "conjecture-generating device", used to generate ideas where humans have missed out.

The problem I have with this is that I can think of far more places where human creativity trumps what a computer has been trained to do. This is not a knock on AI per se. My point is merely that in all kinds of "pattern matching" tasks that require one to make creative leaps, it has been hard to get a computer to do it independently.

So to me, as a conjecture generating machine, the GA doesn't have a good cost-benefit ratio, and as an engineering heuristic, it leaves me, the algorithm designer, in the dark about what's really going on, with no ability to explain its behaviour, or even predict what kinds of problems might be amenable to this kind of approach (local search and PLS maybe?)

Posted by

The moral of the NFL story is that we do not live in a world where all functions are equally likely or equally interesting. No Free Lunch states that in the space of all possible functions, for every one where an algorithm works there will be one where it doesn't work. For instance, for every function with a gradient that leads to an optimum, there will be one where the gradient is deceptive, and leads away from the optimum. Thus a gradient descent algorithm will perform, on average, no better than a gradient-opposing one.

Passerby

ReplyDeleteThe argument relies on the presumption that we are talking about all possible functions. In reality, though, most possible functions would be considered "random" (a can of worms, yes, but appropriate here). I have a friend who likes to say that if you took the number of functions that have been studied by humans and divided it by the number of possible functions, the result would be, essentially, zero.

It seems to me that there is something different about the subset that we do work with. One clue is that most of those functions are encoded as formulas, rather than look-up tables. In other words, they are orderly, compressible, simple. A function that can only be described by a lookup table is a random one, by several definitions. Even when we are working with raw data, we are almost aways searching for the compressible form, the formula that summarizes the underlying function.

So ... the No Free Lunch theorems apply in a kind of dreamworld, the Land of Infinite Possibilities, and not in the world of real working mathematics, where systems cohere in an orderly way. In the years since their publication, it has seemed to me that the great controversy blew up, blew around a little bit, and then blew away. NFL is not a real challenge to anybody.

Posted by

I don't view the NFL theorems as negative results; more as warnings about merchants bearing snake oil. In that respect, they are useful.

Suresh

ReplyDeletePosted by

Alex Lopez-Ortiz

ReplyDeleteThe problem I have with this is that I can think of far more places where human creativity trumps what a computer has been trained to do. This is not a knock on AI per se.For any given problem, my money would be first on formal analysis yielding an efficient solution (that's why I'm a theoretician). If, on the other hand, we have good evidence that finding a solution analitically is very difficult and yet a good feasible solution is needed regardless, then I'd place my money on a computer search.

Depending on the problem, the search heuristic would be different. GA excels in problems where the search space is not isomorphic to a subset of R^n. The fact that non-standard combination operators are an integral part of the heuristic makes GA better suited for these types of problems.

For other problems, other heuristics are better.

I don't view the NFL theorems as negative results; more as warnings about merchants bearing snake oil. In that respect, they are useful.Sure, there were (are?) some over-enthusiastic proponents of GA, and the NFL theorems help to keep thinks in perspective.

Posted by

"because they are extremely general hammers that typically don't have provably performance or running time guarantees"

Yaroslav Bulatov

ReplyDeleteActually, neural network community has done quite a bit of interesting research into performance guarantees. For instance

- "P. L. Bartlett. For valid generalization, the size of the weights is more important than the size of the network", author gives bounds on misclassification rate using size of the weights.

- Work by the school lead by Watanabe studying how algebraic geometry of the set of best fitting models (which is more than 0 dimensional when the model is non-identifiable) relates to statistical efficiency of estimators, in particular neural network

- Work by the school of Amari studying differential geometry of neural networks, and also into statistical apects of neural network estimators.

- John Moody introduced in 1991 "Generalized Prediction Error" penalty which generalized AIC for non-linear models, in particular, neural networks

Neural networks are often used as non-parametric estimators, ie, "general hammers", but they can also be designed for a particular task. Check out Yann LeCun's website for some impressive applications. In particular http://www.cs.nyu.edu/~yann/research/dave/index.html

Posted by

hi there! enjoyed your post. free personals

ReplyDelete