Monday, July 20, 2015

The number-crunchers. How we learned to stop worrying and love to code.

My grandmother was a calculator, and I don’t mean to say I’m the newest from Texas Instruments. I mean my grandmother did calculations for a living, with pencil on paper, using a slide rule and logarithmic tables. She calculated the positions of stars in the night sky, at five-minute intervals, day by day, digit by digit.

Today you can download one of a dozen free apps to display the night sky for any position on Earth, any time, any day. Not that you actually need to know stellar constellations to find True North. Using satellite signals, your phone can now tell your position to within a few meters, and so can 2 million hackers in Russia.

My daughters meanwhile are thoroughly confused as to what a phone is, since we use the phone to take photos but make calls on the computer. For my four-year-old a “phone” is pretty much anything that beeps, including the microwave, which for all I know might by next year start taking photos of dinner and uploading them to Facebook. Oh, and the landline. Now that you mention it, somebody called in March and left a voicemail.

Jack Myers dubbed us the “gap generation,” the last generation to remember the time before the internet. Myers is a self-described “media ecologist” which makes you think he’d have heard of search engine optimization. Unfortunately, when queried “gap generation” it takes Google 0.31 seconds to helpfully bring up 268,000,000 hits for “generation gap.” But it’s okay. I too recall life without Google, when “viral” meant getting a thermometer stuffed between your lips rather than being on everybody’s lips.

I wrote my first email in 1995 with a command-line program called “mail,” when the internet was chats and animated gifs. Back then, searching for a journal article meant finding a ladder and blowing dust off thick volumes with yellowish pages. There were no keyword tags or trackbacks; I looked for articles by randomly browsing through journals. If I had an integral to calculate, there were Gradshteyn and Ryzhik’s tables, or Abramowitz and Stegun’s Handbook of Mathematical Functions, and otherwise, good luck.

Our first computer software for mathematical calculations, one of the early Maple versions, left me skeptical. It had an infamous error in one of the binomial equations that didn’t exactly instill trust. The program was slow and stalled the machine, for which everybody hated me, because my desktop computer was also the institute’s main server (which I didn’t know until I turned it off, but then learned very quickly). I taught myself Fortran and Perl and JavaScript and later some C++, and complained that it wasn’t how I had imagined being a theoretical physicist. I had envisioned myself thinking deep thoughts about the fundamental structure of reality, not chasing after missing parentheses.

It turned out much of my master’s thesis came down to calculating a nasty integral that wasn’t tractable numerically, by computer software, because it was divergent. And while I was juggling generalized hypergeometric functions and Hermite polynomials, I became increasingly philosophical about what exactly it meant to “solve an integral.”

We say an integral is solved if we can write it down as a composition of known functions. But this selection of functions, even the polynomials, is an arbitrary choice. Why not take the supposedly unsolvable integral, use it to define a new function, and be done with it? Why do some functions count as solutions and others don’t? We prefer particular functions because their behavior is well understood. But that again is a matter of how much they have been used and studied. Isn’t it in the end all a matter of habit and convention?
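To make this concrete, here is a minimal sketch (in Python, assuming NumPy and SciPy are at hand, nothing more): the Gaussian integral of exp(-t^2) from 0 to x has no expression in elementary functions, so we simply gave it a name, the error function erf, up to a factor of 2/sqrt(pi). Brute-force quadrature and the “named” function return the same number; the difference between them is convention, not content.

```python
# Minimal sketch: a "solved" integral is just an integral we agreed to give a name to.
import numpy as np
from scipy.integrate import quad
from scipy.special import erf

x = 1.3
by_quadrature, _ = quad(lambda t: np.exp(-t**2), 0.0, x)  # brute-force numerics
by_name = np.sqrt(np.pi) / 2.0 * erf(x)                   # the "known function" route
print(by_quadrature, by_name)                             # agree to machine precision
```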

After two years I managed to renormalize the damned integral and was left with an expression containing incomplete Gamma functions, which are themselves defined by yet other integrals. The best thing I knew to do with this was to derive some asymptotic limits and then plot the full expression. Had there been any way to do this calculation numerically all along, I’d happily have done it, saved two years of time, and gotten the exact same result and insight. Or would I? I doubt the paper would even have gotten published.

Twenty years ago I, like most physicists, considered numerical results inferior to analytical, pen-on-paper derivations. But this attitude has changed, changed so slowly I almost didn’t notice it changing. Today numerical studies are still often viewed with suspicion, fighting the prejudice that they harbor undocumented errors. But it has become accepted practice to publish results merely in the form of graphs, figures, and tables, videos even, for (systems of) differential equations that aren’t analytically tractable. Especially in General Relativity, where differential equations tend to be coupled, non-linear, and burdened with coordinate-dependent coefficients – i.e., as nasty as it gets – analytic solutions are the exception, not the norm.
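For illustration, here is a minimal sketch (in Python with NumPy and SciPy; the mass and angular momentum are made-up values, chosen only so that the orbit stays bound): the relativistic orbit equation d^2u/dphi^2 + u = M/L^2 + 3Mu^2, with u = 1/r in geometric units, differs from its Newtonian counterpart by the nonlinear 3Mu^2 term. Nobody writes down its general solution in closed form; one integrates it numerically and publishes the precessing orbit as a figure.

```python
# Minimal sketch: integrate the relativistic orbit equation
#   d^2u/dphi^2 + u = M/L^2 + 3*M*u^2,   u = 1/r,   geometric units G = c = 1.
# M and L are illustrative values, not taken from any particular calculation.
import numpy as np
from scipy.integrate import solve_ivp

M, L = 1.0, 4.0

def rhs(phi, y):
    u, du = y                                   # y = [u, du/dphi]
    return [du, M / L**2 + 3.0 * M * u**2 - u]

phi = np.linspace(0.0, 6.0 * np.pi, 2000)
sol = solve_ivp(rhs, (phi[0], phi[-1]), [1.0 / 20.0, 0.0], t_eval=phi, rtol=1e-9)
r = 1.0 / sol.y[0]                              # plot x = r*cos(phi), y = r*sin(phi) to see the precession
```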

Numerical results are still less convincing, but not so much because of a romantic yearning for deep insights. They are less convincing primarily because we lack shared standards for coding, whereas we all know the standards of analytical calculation. We use the same functions and the same symbols (well, mostly), whereas deciphering somebody else’s code requires as much psychoanalysis as patience. For now. But imagine you could check code with the same ease with which you check algebraic manipulations. Would you ditch analytical calculations in favor of purely numerical ones, given the broader applicability of the latter? How would insights obtained by one method be of any less value than those obtained by the other?

The increase in computing power has generated entirely new fields of physics by allowing calculations that previously just weren’t feasible. Turbulence in plasmas, supernova explosions, heavy ion collisions, neutron star mergers, or lattice QCD to study the strong nuclear interaction – these are all examples of investigations that have flourished only with the increase in processing speed and memory. Such disciplines tend to develop their own, unique and very specialized nomenclature and procedures that are difficult, if not impossible, for outsiders to evaluate.
[Image: Lattice QCD. Artist’s impression.]

Then there is big data that needs to be handled. Be it LHC collisions, temperature fluctuations in the CMB, or global fits of neutrino experiments, this isn’t data any human can deal with by pen on paper. In these areas too, subdisciplines have sprung up, dedicated to data handling and manipulation. Postdocs specialized in numerical methods are in high demand. But even though they are essential to physics, they face the prejudice of somehow being “merely calculators.”

Maybe the best example is the minuscule corrections to the probabilities of scattering events, like those taking place at the LHC. Calculating these next-to-next-to-next-to-leading-order contributions is an art as much as a science; it is a small subfield of high energy physics that requires summing up thousands or millions of Feynman diagrams. While there are many software packages available, few physicists know all the intricacies and command all the techniques; those who do often develop software along with their research. They are perfecting calculations, aiming for the tiniest increase in precision, much like pole vaulters perfect their every motion aiming for the tiniest increase in height. It is a highly specialized skill, presently at the edge of theoretical physics. But while we admire the relentless perfection of professional athletes, we disregard the single-minded focus of the number-crunchers. What can we learn from it? What insight can be gained from moving the bar an inch higher?

What insight do you gain from calculating the positions of stars in the night sky, you could have asked my grandmother. She was the youngest of seven siblings; her father died in the First World War. Her only brother and her husband were drafted for the Second World War, and feeding the family was left to the sisters. To avoid manufacturing weapons for a regime she detested, she took a position at an observatory, calculating the positions of stars. This appointment came to a sudden stop when her husband was badly injured and she was called to his side at the front to watch him die, or so she assumed. Against all expectations, my grandfather recovered from his skull fractures. He didn’t have to return to the front and my grandma didn’t return to her job. It was only when the war was over that she learned her calculations were to help the soldiers target bombs, knowledge that would haunt her still 60 years later.

What insight do we gain from this? Precision is the hallmark of science, and for much of our society science is a means to other ends. But can mere calculation ever lead to true progress? Surely not with the computer codes we use today, which execute operations but do not search for simpler underlying principles, a search that is necessary to advance understanding. It is this missing search for new theories that leaves physicists cynical about the value of computation. And yet some time in the future we might have computer programs doing exactly this, looking for underlying mathematical laws better suited than existing ones to match observation. Will physicists one day be replaced by software? Can natural law be extracted from data by computers? If you handed all the LHC output to an artificial intelligence, could it spit out the standard model?

In an influential 2008 essay “The End of Theory,” Chris Anderson argued that indeed computers will one day make human contributions to theoretical explanations unnecessary:
“The reason physics has drifted into theoretical speculation about n-dimensional grand unified models over the past few decades (the "beautiful story" phase of a discipline starved of data) is that we don't know how to run the experiments that would falsify the hypotheses — the energies are too high, the accelerators too expensive, and so on[...]

The new availability of huge amounts of data, along with the statistical tools to crunch these numbers, offers a whole new way of understanding the world. Correlation supersedes causation, and science can advance even without coherent models, unified theories, or really any mechanistic explanation at all.”
His vision of the future was widely criticized by physicists, me included, but I’ve had a change of mind. Much of the criticism Anderson took was due to vanity. We like to believe the world will fall to pieces without our genius, and we don’t want to be replaced by software. But I don’t think there’s anything special about the human brain that an artificial intelligence couldn’t do, in principle. And I don’t care very much who or what delivers insights, as long as they come. In the end it comes down to trust.

If a computer came up with just the right string theory vacuum to explain the standard model and offered you the explanation that the world is made of strings to within an exactly quantified precision, what difference would it make whether the headline was made by a machine rather than a human? Wouldn’t you gain the exact same insight? Yes, we would still need humans to initiate the search, someone to write code that serves our purposes. And chances are we would celebrate the human rather than the machine. But the rest is overcoming the prejudice against “number crunching,” which has to be addressed by setting up reliable procedures that ensure a computer’s results are sound science. I’ll be happy if your AI delivers a theory of quantum gravity; bring it on.

My grandmother outlived her husband, who died after multiple strokes. Well into her 90s she still had a habit of checking all the numbers on her receipts, bills, and account statements. Despite my conviction that artificial intelligences could replace physicists, I don’t think it’s likely to happen. The human brain is remarkable not so much for its sheer computing power as for its efficiency, resilience, and durability. Show me any man-made machine that will still run after 90 years of permanent use.

22 comments:

  1. " I too recall life without Google, when “viral” meant getting a thermometer stuffed between your lips"

    Or elsewhere.

  2. At which observatory did she work?

  3. Myers is a self-described “media ecologist” which makes you think he’d have heard of search engine optimization. Unfortunately, when queried “gap generation” it takes Google 0.31 seconds to helpfully bring up 268,000,000 hits for “generation gap.”

    An even worse example (but not from media ecologists) is a band (actually a duo) called BOY. They pointed out in an interview that "Boy" is not googlable. Refining the search to "boy band" doesn't help much either. (Actually, the band BOY consists of two girls.)

    Ah yes, search-engine optimization. I remember spam emails claiming "we have helped to place thousands of sites in the top 10".

  4. I believe it was Göttingen, at least that's where she lived later, but she moved a lot during the war times and I might confuse the story. (Will have to ask my mom.)

  5. Yep, my mom confirms it was Göttingen.

  6. "If a computer came up with just the right string theory vacuum to explain the standard model and offered you the explanation that the world is made of strings to within an exactly quantified precision, what difference would it make whether the headline was made by a machine rather than a human? Wouldn’t you gain the exact same insight?"

    It's the journey that is rewarding and not the actual destination. Imagine Ulysses travelling non-stop from Troy to Ithaka :-)

    "As you set out for Ithaka
    hope the voyage is a long one,
    full of adventure, full of discovery.
    [...]

    And if you find her poor, Ithaka won’t have fooled you.
    Wise as you will have become, so full of experience,
    you will have understood by then what these Ithakas mean. "

    http://www.cavafy.com/poems/content.asp?cat=1&id=74

  7. Elsewhere on the Internet I was having an argument in which one of the responses was that the (gravitational) N-body problem can't be solved by analytical functions but must be computed numerically. I had a similar thought to yours about the difference between analytical functions and numerically-calculated ones. I recalled that when I studied Trigonometry in high school I was given a pamphlet which contained tables of the sine, cosine, and tangent functions (and maybe a few others). We would look up values we needed in the tables and interpolate linearly to get intermediate values. In the computer-code implementations of sine and cosine that I have seen, there are polynomial curve fits for different regions of the curves, perhaps based on those same tabulated values in my pamphlet, which I think were calculated numerically (for the most part).

    So perhaps one could say that the solutions of the simple, second-order ordinary differential equation of the harmonic oscillator must also be calculated numerically.

    "Analytic function" means that the function's derivatives and integrals are known in terms of other analytic functions (or as a convergent power series), but I wonder, how do we know that we could not do that for the solutions of the N-body problem, given enough study of them? Isn't that how Bessel Functions and Trigonometric functions were developed and categorized? So I don't see a big difference between analytical functions and numerical calculations, except that we have studied the former more and know more about them.

    The other thing your essay reminded me of was that my first boss at General Electric, Doris Clarke, began her career as a "computer". In her day, computers were people who sat in front of a mechanical calculator (a sort of large, noisy typewriter with number keys and function keys, that did mainly additions, subtractions, multiplications, and (slowly and reluctantly) divisions), received numbers from a person at a similar desk behind her, did further calculations with them based on the "program" that was running, and passed her results on to the next desk.

  8. JimV: yes, this is exactly what I mean!

  9. 1) HyperChem Lite, costing less than a textbook, mm+ calculates exact 3-D molecular structures in seconds. Terrifically wrong structures for carbonyls and methylidenes can occur. Verify hardware and software with wetware.

    2) "If a computer came up with just the right string theory vacuum to explain the standard model and offered you the explanation that the world is made of strings to within an exactly quantified precision" physics and chemistry should still challenge exact vacuum mirror symmetry toward hadronic matter with an extreme chiral probe. Euclid is one of Thurston's eight primary geometries of 3-space. The other seven matter.

  10. Even stranger is when mathematics itself requires a computer to validate a theorem, as with the four-color theorem, the first of its kind. Such a solution has widely divided the community of mathematicians: is it still mathematics when the demonstration itself is not verifiable by a human? The validation moves from verifying the theorem to validating the verification software.

    As for thinking machines, in 2011 when IBM Watson became Jeopardy world champion I realized it was the end of man's dominance in thinking. Though I knew the theory, I never thought I would see it in my lifetime. As soon as the machine can read and do math, good luck humanity; how can 1 kHz compete with GHz?

  11. Isn't there a more fundamental difference, in terms of "complexity class" or "hardness", between problems (e.g. differential equations) having analytical solutions vs. those having none? For example, if the solution can be expressed as y = f(x) with f a polynomial function, we can evaluate it at any point with finite accuracy in polynomial time. Would this be true if we had to perform a numerical integration of the equation instead (requiring the same accuracy in the solution)?

    The concept of algorithmic complexity does not seem to differentiate the two cases, since in both cases the unknown number is expressed as a very compact string of characters (the equation itself or its analytical solution). Is there any other relevant measure of complexity applicable here?

  12. Chris, you do not have to go very far in algebra. Matiyasevich's theorem (1970) solves Hilbert's tenth problem in the negative.

  13. You need to be able to prove error bounds on your numerical computation, which is presumably well-understood in the case of standard functions.

    I'm assuming that the existence proof (that the computation is meaningful) already exists.

    Numerical computations also need to give insight into limiting cases, asymptotic forms, and how input parameters or boundary conditions change the output.

    Also, how would you arrive at the notion of the renormalization group from purely numerical calculations?

  14. Facial recognition is something your eyes & brain do automagically. But trying to replicate that with computer and camera requires both computation and analytic understanding. E.g., Eigenfaces

  15. Slightly off-topic, but not entirely, the movie Ex Machina is highly recommended for artistic elegance and thought-provoking subject matter.

  16. Quite an interesting story about your grandmother. She would have been about the same generation as my parents. I well remember our dad meticulously balancing his checkbook by hand, back in the 1950s. Even attending high school between '62 and '65 we did all calculations by hand. When my older brother bought a pocket calculator in the early '70s, for 200 dollars, it was an absolute marvel.

    I just finished reading (online) the very nicely written article in Scientific American "A Geometric Theory of Everything" by A. Garrett Lisi and James Owen Weatherall, that covers the evolution of the Standard Model to its present plateau - SU(3), SU(2), U(1). This is followed by an elucidation of how the E(8) Lie group is utilized to embrace both the Standard Model and gravity into a single, unified geometric structure.

    The article just skims the surface of how Lie groups apply to the physics of particles and fields. But the full mathematical complexity of this field of endeavor is quite mind-boggling, and I could well imagine the need for AI supercomputers to assist with, or even completely take over, the model-building process.

  17. "Correlation supersedes causation"
    someone needs to tell all those machine learning researchers working on causal graphical models they are wasting their time - or, perhaps, now that causation has become a going concern within wheelhouses proximate to Chris Anderson's own, it isn't such a useless thing to study after all. I'm not trying to put him down for it - deciding that subjects which don't put food on your own plate are uninteresting is the mark of a professional. A working scientist. A survivor.

    We should take that into account when human creatures tell us what is and isn't interesting. They might just be saying, there's no low hanging fruit on that tree for me to eat. Doesn't mean the view from the top of that tree isn't important.

    The most obvious subject for Machine Learning to eat would be experiment.

  18. I read a story years ago (unfortunately I cannot remember the title) that imagined a future where numerical solving programs grew in power to the point where they answered fundamental questions. However, just like Chris Anderson's quote above "... and science can advance even without coherent models, unified theories, or really any mechanistic explanation at all.", the numerical results based upon numerical results based upon... led to answers without any understanding or explanation for where it came from. And worse, there were further results appearing to be answers to questions that no one even knew to ask. It was an interesting cautionary tale about tools outpacing understanding and the interplay between answering a question and understanding a question and understanding an answer.

  19. Believe it or not I wrote this comment before I read Arun's comment, it's purely coincidental . . .

    I thought this issue was settled long ago!?! Numerical methods have really been indispensable to science since the 70's with the parallel advent of more efficient computation and dynamical chaos. How did Feigenbaum discover Universality in phase transitions? And in my opinion it was Feigenbaum's discovery which put the renormalization group on a solid foundation, at least philosophically. And even when Lanford, a pure mathematician, developed a proof, which the community deemed rigorous, the proof depended to a large degree on numerical methods.



    I think people who express disdain for numerical methods are just inherently dishonest, with themselves and others. All knowledge is provisional in that it rests on a foundation of induction and numerical methods bring this to the forefront. And that, to me, is a good thing; it dispels dogma! As the early chaos pioneers liked to say, numerical methods develop intuition. Dogmatists quite often see causation where only correlation exists anyway!



    This is really what made me wonder if perhaps James Gates hadn't discovered the reason why Universality appears with his adinkras. If you're not familiar with Gates, he works with SUSY and his adinkras are Feynman diagram analogs which represent oftentimes complicated systems of super-differential equations. To evolve the system you fold the adinkra but this folding process can be quite complex and if you're not careful you can lose SUSY. So what Gates did was assign each node in the adinkra a binary word and he discovered, quite by "accident," that the folding process which maintains SUSY conforms to one of Hamming's error-correction codes! So perhaps one sees Universality in phase transitions due to some error-correction process?



    http://arxiv.org/pdf/hep-th/0408004v1.pdf



    http://www.bottomlayer.com/PWJun10gates.pdf



    The last link is to an article which appeared in Physics World, 2014.

  20. This is an evolution which is not difficult to predict (for example here https://www.reddit.com/r/Physics_AWT/comments/2htmk5/science_graduates_are_not_that_hot_at_maths_but). It's the increasing complexity of formal models and the decreasing cost of computer time which will force physicists to orient themselves toward numerical calculations and even simulations, even though some of them are still proud of their analytical skills for now.

  21. Here's another weird coincidence! I wrote this comment before reading Zephir's:

    You know it was this whole train of thought which led me to the idea of bisimulation on non-well-founded sets - which I have unsuccessfully tried to convey to you and, through you, to Renate Loll. So, I'll express it here and then drop it forever!

    As so eloquently expressed in Smolin's Three Roads to Quantum Gravity, the high degree of fine-tuning we witness defies probabilities; this, to me, strongly suggests retro-causation or, in other words, a distinct final condition. So why couldn't you use the adinkras of James Gates to develop a bisimilar model? I don't see why you couldn't because essentially adinkras are analogs to graphs and the folding process establishes relations between nodes:

    http://www.cs.indiana.edu/cmcs/bisimulation.pdf

    So, you develop one adinkra which evolves from a distinct initial condition "forward" in time and another bisimilar adinkra which evolves from a distinct final condition "backward" in time. These adinkras are not symmetrical; rather, they simulate one another, hence, bisimulation. At forward time step t = a the adinkra folding "forward" in time may have gone through y folds while the adinkra folding "backward" in time may have gone through xy folds, but both processes result in the same system state at forward time step t = a. They simulate one another. Would this not put an interesting constraint on the initial and final conditions? And what if what we think of as initial and final conditions are in actuality phase transitions? Could such a model perchance be illuminating?

    Some scientists say non-well-founded sets are incompatible with quantum theory but Ben Goertzel, an expert on non-well-founded sets, dispenses with that myth in chapter seven of his book Chaotic Logic.

    In case you missed the subtleties in my last comment, I'm not suggesting that the error-correction takes place in the world we perceive, what you call Minkowski space and Will Tiller calls D-space, rather, the error-correction occurs in Will Tiller's R-space. It occurs in the "electron-clock" described by the Zitter Model of David Hestenes and it's super-luminal. This is why the world we perceive appears coherent and consistent.

  22. We've been well past the era of closed form solutions for some time. I remember some popular physics book, I think by Steven Weinberg, saying that we can tell the sophistication of a gravitational theory by the N at which the N-body problem has no closed form solution. Newton's fails at N=3, Einstein's at N=2, and many popular quantum gravity theories fail at N=1 or even N=0.

    I think a lot of the discomfort is like the Pythagorean discomfort with the irrational square root of two, or the awkwardness, in JimV's discussion, of sine and cosine not having simple finite means of evaluation. (BTW, computers use Chebyshev series, which start with the Taylor expansion and optimize the coefficients for evaluation across a small region, e.g. from 0 to pi/4.) We've grown comfortable with things like sine and cosine and gamma and Bessel functions. We haven't grown comfortable with all the functions that we construct when we solve numerical problems. The problem is cultural. I remember a Czech scientist arguing with me that Monte Carlo integration wasn't real integration, it was American integration. Perhaps he was right.
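    For what it's worth, here is a toy version of that kind of fit (a sketch in Python with NumPy; real math libraries go further and tune the coefficients with a Remez-style minimax fit rather than a plain least-squares fit):

    ```python
    # Toy sketch: approximate sin(x) on [0, pi/4] by a low-degree Chebyshev fit,
    # roughly the kind of polynomial a math library evaluates under the hood.
    import numpy as np

    x = np.linspace(0.0, np.pi / 4.0, 1000)
    cheb = np.polynomial.Chebyshev.fit(x, np.sin(x), deg=7)
    print(np.max(np.abs(cheb(x) - np.sin(x))))  # worst-case error on the interval (tiny, far below single precision)
    ```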

    Back in the 1980s, the computer scientist Gerald Sussman proposed that in the future physics would be done using high-powered computers, as part of his dynamicist's workbench project. He and Jack Wisdom used a prototype machine to demonstrate that the orbit of Pluto was chaotic. There was some discomfort with their results, but it got swept under the rug of chaos theory.

    I remember when the Four Color Theorem was proved in the 1970s. A lot of people were uncomfortable with the fact that a computer had to validate the hundreds of graph configurations. Since the 1980s mathematicians have gotten more and more comfortable using computers for mathematical exploration. It took a month of computing time to prove the Four Color Theorem on a high-powered mainframe in the '70s, but now you can run the solution on your desktop in a minute or less.

    There is nothing like a computer for banging numbers together at high energies. Insights come from strange observations. For example, some mathematician noticed that an algorithm for calculating the digits of pi seemed to converge more rapidly every few hundred digits rather than more evenly. This led to a formula for the N-th digit of pi, in base 16, but it was a formula that no one had suspected even existed.
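    The formula alluded to here is presumably the Bailey-Borwein-Plouffe formula. A rough sketch of the digit-extraction trick in Python (a toy version; floating-point round-off limits it to modest positions, and real implementations manage precision far more carefully):

    ```python
    # Rough sketch of BBP-style digit extraction:
    #   pi = sum_k 16^-k * ( 4/(8k+1) - 2/(8k+4) - 1/(8k+5) - 1/(8k+6) )
    # The fractional part of 16^n * pi gives the hex digit at position n+1
    # without computing any of the earlier digits.

    def _series(j, n):
        # fractional part of sum_k 16^(n-k) / (8k + j)
        s = 0.0
        for k in range(n + 1):                    # head: keep only fractional parts, via modular exponentiation
            s = (s + pow(16, n - k, 8 * k + j) / (8 * k + j)) % 1.0
        k, term = n + 1, 1.0
        while term > 1e-17:                       # tail: terms shrink by a factor of 16 each step
            term = 16.0 ** (n - k) / (8 * k + j)
            s += term
            k += 1
        return s % 1.0

    def pi_hex_digit(n):
        frac = (4 * _series(1, n) - 2 * _series(4, n) - _series(5, n) - _series(6, n)) % 1.0
        return "0123456789ABCDEF"[int(frac * 16)]

    print("".join(pi_hex_digit(n) for n in range(8)))  # prints 243F6A88, the first hex digits of pi after the point
    ```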

    I think that in another generation or two physicists are going to think no more of numerical solutions as opposed to algebraic solutions than we do of sines and cosines or negative numbers, but I don't think humans are going to be taken out of the loop. Computers are called thinking machines not because they can think for themselves, but because they let us think better, faster and farther.


