Lawns in public places all suffer from the same problem: people don’t like detours. In cities throughout the world people search for the fastest route to the workplace, the shortest way to the restroom, the least pricey airline, the most convenient parking spot. Depending on resources and personal preferences, we optimize our day according to criteria we consider important. Cooks experiment with recipes to create the most delicious meals, politicians argue about taxation to score in polls, you aim to find the most comfortable position on your couch. In most cases, these are optimizations through incremental modifications and evaluation of the change, little steps of trial and learning, and eventual selection of the optimal solution.
Not only do our daily lives reflect our aim to optimize under variation; Nature itself shows the selection of optimal configurations. A soap bubble minimizes surface area [1]. Electric currents prefer the path of least resistance; water runs downhill around obstacles in its way.
In all cases we have a system with a quantity which is optimized for one of many possible configurations, and the configuration optimal in this regard is the one realized in Nature. Optimization can mean either lowering a quantity to a minimal value, or obtaining a maximal value. Might that be you slouching on the couch with your feet on the table because it’s the most comfortable way to spend your evening, or dozens of students trampling their traces in the campus’ lawn because it’s the fastest way to coffee.
The same idea underlies theoretical physics. For every system we want to describe we have a quantity whose value has to be optimized. The way we find the optimal configuration is to make small changes and to take the configuration which would get less optimal under any change. This is essentially the same procedure one uses for finding the extrema of a function by requiring the first derivative to vanish. These small changes are called ‘variations’ and are denoted with a small delta δ, and the process is called the ‘variational principle’. For the optimal configuration, the variation has to vanish. In physics, in most cases the quantity optimized is called the ‘action’, and is usually denoted with a capital S. The requirement then reads

$$\delta S = 0$$
The Best of all Possible Worlds
One can’t write about the variational principle without some name dropping. The first to drop is Gottfried Wilhelm Leibniz, German scientist, philosopher, mathematician, and the one with the cookie. Already in 1710, Leibniz wrote the “Essais de Théodicée sur la bonté de Dieu, la liberté de l'homme et l'origine du mal” (Engl.: Essays of theodicy on the goodness of God, the freedom of man and the origin of evil; writing in French was apparently chic in the 18th century), in which he develops the thought that our world is the ‘best of all possible worlds’. Though his argumentation is heavily theological, he made fairly clear that with the ‘best’ world he meant one that is optimal in some sense, possibly in a sense not immediately obvious to us, and that the optimal world has to be among the indeed possible worlds. Certainly, we can imagine a world with less evil, less hunger, less poverty and less spam in my inbox, but is this a possible world? The question touches on the trouble with all utopias that build upon idealized caricatures of men, and remain wishful thinking because they fail to describe reality.
Voltaire famously made fun of the idea that the world as it is could be optimal in his satire “Candide, ou l'Optimisme,” in which he attacks the Leibnizian optimism. This always seemed odd to me since I find Leibniz' conviction that the world we live in is the best possible one rather pessimistic.
Leibniz however inspired Maupertuis, a French mathematician and philosopher, to put the linguistic argument on a more rigorous base. Maupertuis came up with the concept that light travels on the shortest path, known as Maupertuis' principle, which was a successor of Fermat's principle and the predecessor to the principle of least action. The calculus of the general variational principle was developed only shortly afterwards by Leonhard Euler and Joseph-Louis Lagrange. Both of their names are today intimately connected with the formalism: the equations of motion one derives from the variation are known as the ‘Euler-Lagrange equations’, and the action is an integral over a function called the ‘Lagrangian’ (though certain people insist on calling it ‘Lagrangean’).
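For readers who like to see it written out, here is a sketch of how the Euler-Lagrange equations follow from a vanishing variation. The notation is an assumption made for illustration and is not taken from the post: a single coordinate x(t), a Lagrangian L(x, ẋ, t), and variations δx that vanish at the end points t₁ and t₂.

```latex
\begin{align}
S[x] &= \int_{t_1}^{t_2} L(x, \dot{x}, t)\, dt \\
\delta S &= \int_{t_1}^{t_2} \left( \frac{\partial L}{\partial x}\, \delta x
            + \frac{\partial L}{\partial \dot{x}}\, \delta \dot{x} \right) dt \\
         &= \int_{t_1}^{t_2} \left( \frac{\partial L}{\partial x}
            - \frac{d}{dt} \frac{\partial L}{\partial \dot{x}} \right) \delta x \, dt
            + \left[ \frac{\partial L}{\partial \dot{x}}\, \delta x \right]_{t_1}^{t_2}
\end{align}
% The boundary term vanishes because \delta x(t_1) = \delta x(t_2) = 0.
% Requiring \delta S = 0 for arbitrary \delta x then gives
\begin{equation}
\frac{d}{dt} \frac{\partial L}{\partial \dot{x}} - \frac{\partial L}{\partial x} = 0
\end{equation}
```

The second step is an integration by parts; everything else is just the chain rule applied under the integral.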
Examples
The Lagrangian for Newtonian mechanics, for example, is

$$L = \frac{1}{2} m \dot{x}^2 - V(x)$$

where V is the potential.
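To make the ‘small changes’ concrete, here is a minimal numerical sketch, not a definitive implementation. It assumes the standard Newtonian Lagrangian L = ½ m ẋ² − V(x) with the potential of uniform gravity, V(x) = m g x, fixed end points x(0) = x(T) = 0, and a simple discretization of the action. All function names and parameter values are made up for illustration. The interior points of the path are nudged downhill in the action until nothing improves anymore, and the result is compared with the parabola one gets from solving Newton's equation directly.

```python
# Sketch: path of least action for a particle in uniform gravity.
# Assumptions (not from the post): V(x) = m*g*x, end points x(0) = x(T) = 0,
# and a plain discretization of the action into equal time steps.

def action(path, m, g, dt):
    """Discretized action: sum over steps of (kinetic - potential) * dt."""
    total = 0.0
    for a, b in zip(path, path[1:]):
        kinetic = 0.5 * m * ((b - a) / dt) ** 2
        potential = m * g * 0.5 * (a + b)  # potential at the midpoint of the step
        total += (kinetic - potential) * dt
    return total


def action_gradient(path, m, g, dt):
    """Derivative of the discretized action with respect to each interior point."""
    grad = [0.0] * len(path)
    for i in range(1, len(path) - 1):
        second_difference = path[i + 1] - 2.0 * path[i] + path[i - 1]
        grad[i] = -m * second_difference / dt - m * g * dt
    return grad


def least_action_path(m=1.0, g=9.81, total_time=1.0, steps=20, iterations=5000):
    dt = total_time / steps
    path = [0.0] * (steps + 1)   # initial guess: the particle never moves
    step_size = dt / (4.0 * m)   # small enough for the descent to stay stable
    for _ in range(iterations):
        grad = action_gradient(path, m, g, dt)
        # Vary only the interior points; the end points stay fixed.
        for i in range(1, steps):
            path[i] -= step_size * grad[i]
    return path, dt


path, dt = least_action_path()
g, T = 9.81, 1.0
# Solution of Newton's equation with the same end points: x(t) = g t (T - t) / 2
exact = [0.5 * g * (i * dt) * (T - i * dt) for i in range(len(path))]
max_error = max(abs(p - e) for p, e in zip(path, exact))
print(f"largest deviation from the parabola: {max_error:.2e}")
```

Note that the descent loop is only a numerical trick to locate the stationary path. It does not mean the particle itself tries out paths one after the other.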
A nicer example, however, is the Lagrangian for a point particle of mass m in a possibly curved background described by the metric tensor gμν, where the action is just the length of the curve:

$$S = m \int ds, \qquad ds^2 = g_{\mu\nu}\, dx^\mu dx^\nu$$
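The equation of motion is then the so-called geodesic equation. Its solutions are the shortest curves in an arbitrary background, which are not usually just straight lines as they are in flat space (e.g. on a lawn):

$$\frac{d^2 x^\mu}{ds^2} + \Gamma^\mu_{\ \nu\kappa}\, \frac{dx^\nu}{ds} \frac{dx^\kappa}{ds} = 0$$

Here s is the length along the curve and the Γ are the Christoffel symbols of the metric.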
The routes of airplanes for example are to a good approximation geodesics of a sphere (neglecting wind and countries that have issues with their airspace). Though it might not look like it on the route map you find in the in-flight magazine, these are indeed the shortest connections possible.
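The claim about flight routes is easy to check with a few lines of code. The sketch below assumes a perfectly spherical Earth with a mean radius of 6371 km and picks two made-up points at the same latitude. It compares the great-circle distance (the geodesic, computed with the haversine formula) with the path that simply follows the circle of latitude, which is the one that looks straight on many flat maps. The function names are chosen for illustration only.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius; the Earth is treated as a perfect sphere


def great_circle_km(lat1, lon1, lat2, lon2):
    """Length of the geodesic between two points on a sphere (haversine formula)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    h = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def along_parallel_km(lat, lon1, lon2):
    """Length of the path that stays on a circle of constant latitude."""
    return EARTH_RADIUS_KM * math.cos(math.radians(lat)) * abs(math.radians(lon2 - lon1))


# Two hypothetical airports at 50 degrees north, 120 degrees of longitude apart.
lat, lon_a, lon_b = 50.0, 0.0, 120.0
geodesic = great_circle_km(lat, lon_a, lat, lon_b)
parallel = along_parallel_km(lat, lon_a, lon_b)
print(f"great circle: {geodesic:.0f} km, along the parallel: {parallel:.0f} km")
```

Only on the equator do the two paths coincide, because the equator is itself a great circle.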
For fields instead of single particles, the Lagrangian is a function over space-time and the action is a volume integral. The Lagrangian of General Relativity, for example, is just the curvature scalar R; the variation yields Einstein’s field equations. The Lagrangian of free Electrodynamics is F², where F is the field-strength tensor; the variation yields the free Maxwell equations. One can couple these to matter in a straightforward way to also get the source terms.
One thing that is important to notice here is that these variations are not actual changes of the real system. Unlike you trying several non-optimal ways to your new workplace before you find one that you like best, here the variation is not performed in reality. It is a variation in the space of could-be configurations, a procedure to find the one realized. The world wasn’t created crappy and then became better, it was just optimal all along. Systems solve the equations one obtains from the variational principle, they don’t learn how to do so as time goes by.
So, now that we've talked about the ‘best’ world, let us replace ‘best’ with ‘fittest’...
Cosmological Natural Selection
Cosmological Natural Selection (CNS) is Lee Smolin’s suggestion for a testable alternative to the anthropic principle (see hep-th/0612185). You find my thoughts on the anthropic principle here, in brief: The (weak) anthropic principle is nothing but a constraint on the parameters of our theories. It is trivially correct that our universe allows the existence of life (we can discuss whether the life we know is intelligent or not), and thus whatever model we use had better not be in disagreement with that. This can indeed provide constraints on the parameters of your theory. However, if your theory can’t accommodate the requirement, you’d not throw out the assumption that life is possible in our universe, but rather your theory.
The most severe problem with the anthropic principle is that a mathematically useful definition of life is absent, and the presence of life is not something that anybody has yet managed to quantify. Thus, it is not a scientific argument but a rhetorical one, and for practical purposes it is useless. However, as Lee points out correctly (in his paper and also in his recent talk at the Multiverse conference, see PIRSA 08090050), anthropic arguments typically have nothing to do with life in the first place, but with some more general preconditions such as the formation of galaxies or the existence of carbon molecules, which can then indeed be scientifically evaluated.
CNS tackles the question why our universe is as it is essentially by suggesting a quantity that is optimized for our universe with the parameters that we observe. The specific quantity in this case is the number of black holes, and Lee’s argument is that the number of black holes would drop whenever one changes the parameters of our theories; one can apparently study a lot of different scenarios where that holds. However, as it stands, this quantity, the number of black holes, is unfortunately also ill-defined. The obvious questions are: If N black holes merge, does that count as one or as N? What if a black hole evaporates? Do virtual black holes count?
But forget for a moment the number of black holes specifically and take any quantifiable function of the parameters in the Standard Model and ΛCDM. Then the idea is simply that our universe, with the parameters as we have measured them, optimizes this quantity. Turning the parameters to see whether the number increases or decreases is a poor man’s way of finding a maximum, so pretty much a variational principle.
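The poor man's procedure can be sketched in a few lines. Everything here is hypothetical: `toy_quantity` is a made-up stand-in with a known maximum, not an actual count of black holes, and the parameter names `alpha` and `beta` do not refer to anything in the Standard Model or ΛCDM. The sketch only shows the logic of nudging each parameter a little in both directions and checking that the quantity never goes up.

```python
def is_local_maximum(quantity, parameters, relative_step=1e-3):
    """Nudge each parameter up and down; True if the quantity never increases."""
    reference = quantity(parameters)
    for name, value in parameters.items():
        for factor in (1.0 - relative_step, 1.0 + relative_step):
            varied = dict(parameters)
            varied[name] = value * factor  # change one parameter, keep the rest
            if quantity(varied) > reference:
                return False
    return True


def toy_quantity(p):
    # Made-up stand-in with its maximum at alpha = 2, beta = 3.
    return -(p["alpha"] - 2.0) ** 2 - (p["beta"] - 3.0) ** 2


at_maximum = is_local_maximum(toy_quantity, {"alpha": 2.0, "beta": 3.0})
off_maximum = is_local_maximum(toy_quantity, {"alpha": 2.5, "beta": 3.0})
print(at_maximum, off_maximum)
```

Such a check varies one parameter at a time and only probes the immediate neighborhood, so it can establish at best a local maximum along the parameter axes.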

That doesn’t quite explain though why that scenario is called something-with-natural-selection. The reason for this I suspect is a severe case of Santafeism, and that’s where the idea stops making sense to me. See, the black holes, they supposedly don’t have singularities inside but form new ‘baby universes’ with slightly modified parameters, and in such a way the ‘fittest’ universe, i.e. the one with most black hole offspring, is the one we’d likely find ourselves in. That baby-universe story sounds cute, but is as far as I am concerned wishful thinking.
But again, forget for a moment also the story with the baby universes.
What remains is the idea that we live in the “best of all possible worlds” in some sense. Might that be the “best” world to produce black holes or something else. The question here is simply whether there exists a quantity that is optimized for exactly the parameters of the Standard Model + ΛCDM that we observe in our universe. It is natural to think of complexity as a possible alternative, but here again one runs into the problem that it is not a well-defined quantity (what’s the complexity of planet Earth?) and thus useless for practical purposes. The question remains, however, whether the non-optimal worlds “really” exist.
I’m not a big fan of CNS because of the problems mentioned above (and some others), but I like the general idea that the function to be optimized might be a macroscopic quantity that is not easily derivable from the fundamental laws.
A Principle of Everything?
The variational principle has proved to be enormously useful and successful. In addition to that it is also a compelling, simple and elegant formulation. It has everything a theoretical physicist desires. Nevertheless, I can’t help but occasionally wonder: what if this principle does not in fact hold for the yet to be discovered unified fundamental laws that govern our universe?
Neither Einstein nor Maxwell initially formulated their theories starting from a Lagrangian; they started with the field equations. Yet at some point during the last century, it became standard procedure to start from the Lagrangian [2], which reduces the space of possible theories, as there are indeed equations of motion that do not follow from any Lagrangian. Given that the examples I know are not examples I’d consider particularly interesting, this might not be a big loss. But the existence of a Lagrangian is nevertheless an implicit assumption that is not typically much discussed.
Sensemaking
The principle of least action appeared on our curriculum in my second year at college, and it has to me always been the most beautiful explanation of the world around us. Not only is the idea of optimization compelling, but the same principle can be used for completely different systems, and for different theories. The only thing one needs to change is the function to be optimized. With that function, you perform the variation according to a well-defined mathematical procedure and get the equations of motion. What a relief it was to eventually have a clear procedure to arrive at the relevant equations after we had spent years in physics assembling equations on a case-by-case basis! Textbooks frequently offered explanations that only made sense if you already knew the result; it typically involved a lot of guessing and hand-waving, or knowing where to find the solution to the exercise. Now suddenly that all made sense.
More about the history and applications of the variational principle in this nicely illustrated book:
The Parsimonious Universe: Shape and Form in the Natural World
By Stefan Hildebrandt and Anthony Tromba
[1] An optimization that heavily inspired the architecture of Frei Otto, who for example designed the Olympic Stadium in Munich. See e.g. The lightweight champion of the world - How soap bubbles and cobwebs helped Frei Otto win architecture's greatest prize, by Jonathan Glancey.
[2] Or, some decades later, from the path integral.
Lee Smolin (2006). The status of cosmological natural selection. arXiv: hep-th/0612185