Wednesday, August 28, 2019

Solutions to the black hole information paradox

In the early 1970s, Stephen Hawking discovered that black holes can emit radiation. This radiation allows black holes to lose mass and, eventually, to entirely evaporate. This process seems to destroy all the information that is contained in the black hole and therefore contradicts what we know about the laws of nature. This contradiction is what we call the black hole information paradox.

After discovering this problem 40 years ago, Hawking spent the rest of his life trying to solve it. He passed away last year, but the problem is still alive and there is no resolution in sight.

Today, I want to tell you what solutions physicists have so far proposed for the black hole information loss problem. If you want to know more about what exactly the problem is, please read my previous blogpost.

There are hundreds of proposed solutions to the information loss problem, and I can’t possibly list them all here. But I want to tell you about the five most plausible ones.

1. Remnants.

The calculation that Hawking did to obtain the properties of the black hole radiation makes use of general relativity. But we know that general relativity is only approximately correct. It eventually has to be replaced by a more fundamental theory, which is quantum gravity. The effects of quantum gravity are not relevant near the horizon of large black holes, which is why the approximation that Hawking made is good. But it breaks down eventually, when the black hole has shrunk to a very small size. Then, the space-time curvature at the horizon becomes very strong and quantum gravity must be taken into account.
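That a smaller black hole has a more strongly curved horizon may seem backwards, so here is a quick sketch for the simplest case, a non-rotating and uncharged black hole of mass M. A standard measure of curvature in general relativity is the Kretschmann scalar K. Evaluating the textbook expression at the Schwarzschild radius r_s gives:

```latex
K = \frac{48\, G^2 M^2}{c^4\, r^6}
\quad\Longrightarrow\quad
K\big|_{r = r_s} = \frac{3\, c^8}{4\, G^4 M^4},
\qquad r_s = \frac{2 G M}{c^2}
```

The curvature at the horizon therefore grows like 1/M^4 as the black hole loses mass, which is why the approximation is fine for large black holes but fails for very small ones.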

Now, if quantum gravity becomes important, we really do not know what will happen because we don’t have a theory for quantum gravity. In particular we have no reason to think that the black hole will entirely evaporate to begin with. This opens the possibility that a small remainder is left behind which just sits there forever. Such a black hole remnant could keep all the information about what formed the black hole, and no contradiction results.

2. Information comes out very late.

Instead of just stopping to evaporate when quantum gravity becomes relevant, the black hole could also start to leak information in that final phase. Some estimates indicate that this leakage would take a very long time, which is why this solution is also known as a “quasi-stable remnant”. However, it is not entirely clear just how long it would take. After all, we don’t have a theory of quantum gravity. This second option removes the contradiction for the same reason as the first.

3. Information comes out early.

The first two scenarios are very conservative in that they postulate new effects will appear only when we know that our theories break down. A more speculative idea is that quantum gravity plays a much larger role near the horizon and the radiation carries information all along; it’s just that Hawking’s calculation doesn’t capture it.

Many physicists prefer this solution over the first two for the following reason. Black holes do not only have a temperature, they also have an entropy, called the Bekenstein-Hawking entropy. This entropy is proportional to the area of the black hole. It is often interpreted as counting the number of possible states that the black hole geometry can have in a theory of quantum gravity.

If that is so, then the entropy must shrink when the black hole shrinks and this is not the case for the remnant and the quasi-stable remnant.

So, if you want to interpret the black hole entropy in terms of microscopic states, then the information must begin to come out early, when the black hole is still large. This solution is supported by the idea that we live in a holographic universe, which is currently popular, especially among string theorists.
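For reference, the Bekenstein-Hawking entropy mentioned above has a simple closed form. The version below assumes a non-rotating, uncharged black hole of mass M, so that the horizon area A follows directly from the Schwarzschild radius:

```latex
S_{\mathrm{BH}} = \frac{k_B\, c^3 A}{4\, G \hbar},
\qquad
A = 4\pi r_s^2 = \frac{16 \pi\, G^2 M^2}{c^4}
```

Since the entropy scales with M^2, it decreases steadily while the black hole evaporates. This is the sense in which the entropy "must shrink when the black hole shrinks".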

4. Information is just lost.

Black hole evaporation, it seems, is irreversible and that irreversibility is inconsistent with the dynamical law of quantum theory. But quantum theory does have its own irreversible process, which is the measurement. So, some physicists argue that we should just accept black hole evaporation is irreversible and destroys information, not unlike quantum measurements do. This option is not particularly popular because it is hard to include an additional irreversible process in quantum theory without spoiling conservation laws.

5. Black holes don’t exist.

Finally, some physicists have tried to argue that black holes are never created in the first place in which case no information can get lost in them. To make this work, one has to find a way to prevent a distribution of matter from collapsing to a size that is below its Schwarzschild radius. But since the formation of a black hole horizon can happen at arbitrarily small matter densities, this requires that one invents some new physics which violates the equivalence principle, and that is the key principle underlying Einstein’s theory of general relativity. This option is a logical possibility, but for most physicists, it’s asking for too much.

Personally, I think that several of the proposed solutions are consistent. That includes options 1-3 above, and other proposals such as those by Horowitz and Maldacena, ‘t Hooft, or Maudlin. This means that this is a problem which just cannot be solved by relying on mathematics alone.

Unfortunately, we cannot experimentally test what is happening when black holes evaporate because the temperature of the radiation is much, much too small to be measurable for the astrophysical black holes we know of. And so, I suspect we will be arguing about this for a long, long time.
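To get a feeling for just how small that temperature is, here is a rough sketch that evaluates the standard Hawking temperature formula, T = ħc³/(8πGMk_B), for a black hole of one solar mass. It assumes a non-rotating, uncharged black hole; the function name and the rounded constants are chosen for illustration only.

```python
import math

# Physical constants in SI units (rounded CODATA values)
HBAR = 1.054571817e-34   # reduced Planck constant, J s
C = 2.99792458e8         # speed of light, m/s
G = 6.67430e-11          # Newton's constant, m^3 kg^-1 s^-2
K_B = 1.380649e-23       # Boltzmann constant, J/K
M_SUN = 1.98847e30       # solar mass, kg
T_CMB = 2.725            # temperature of the cosmic microwave background, K


def hawking_temperature(mass_kg):
    """Hawking temperature in kelvin for a non-rotating, uncharged black hole."""
    return HBAR * C**3 / (8 * math.pi * G * mass_kg * K_B)


t_solar = hawking_temperature(M_SUN)
print(f"Solar-mass black hole: {t_solar:.2e} K")
print(f"Fraction of the CMB temperature: {t_solar / T_CMB:.1e}")
```

The result is of order 10^-8 kelvin, far below the temperature of the cosmic microwave background, and the temperature only gets lower for heavier black holes because it scales like 1/M.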

Friday, August 23, 2019

How do black holes destroy information and why is that a problem?

Today I want to pick up a question that many of you asked, which is how do black holes destroy information and why is that a problem?

I will not explain here what a black hole is or how we know that black holes exist; for this you can watch my earlier video. Let me instead get right to black hole information loss. To understand the problem, you first need to know the mathematics that we use for our theories in physics. These theories all have two ingredients.

First, there is something called the “state” of the system, that’s a complete description of whatever you want to make a prediction for. In a classical theory, that’s one which is not quantized, the state would be, for example, the positions and velocities of particles. To describe the state in a quantum theory, you would instead take the wave-functions.

The second ingredient to the current theories is a dynamical law, which is also often called an “evolution equation”. This has nothing to do with Darwinian evolution. Evolution here just means this is an equation which tells you how the state changes from one moment of time to the next. So, if I give you a state at any one time, you can use the evolution equation to compute the state at any other time.

The important thing is that all evolution equations that we know of are time-reversible. This means it never happens that two states that differ at an initial time will become identical states at a later time. If that was so, then at the later time, you wouldn’t know where you started from and that would not be reversible.
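A toy version of this statement, with a made-up system that has only three possible states, can be sketched as follows. The state labels and the two example maps are purely hypothetical; the point is only that an evolution law is reversible exactly when no two different states end up in the same place.

```python
def is_reversible(step, states):
    """Return True if no two distinct states are mapped to the same state."""
    images = [step[s] for s in states]
    return len(set(images)) == len(images)


states = ["A", "B", "C"]

# Reversible toy law: every state has exactly one predecessor.
cyclic_shift = {"A": "B", "B": "C", "C": "A"}

# Irreversible toy law: A and B both end up in B.
merging_map = {"A": "B", "B": "B", "C": "A"}

print(is_reversible(cyclic_shift, states))
print(is_reversible(merging_map, states))

# For the reversible law we can construct the backward evolution explicitly.
inverse = {after: before for before, after in cyclic_shift.items()}
print(inverse["B"])
```

For the merging map no such inverse exists: if you find the system in state B, you cannot tell whether it started in A or in B.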

A confusion that I frequently encounter is that between time-reversibility and time-reversal invariance. These are not the same. Time-reversible just means you can run a process backwards. Time-reversal invariance, on the other hand, means it will look the same if you run it backwards. In the following, I am talking about time-reversibility, not time-reversal invariance.

Now, all fundamental evolution equations in physics are time-reversible. But this time-reversibility is in many cases entirely theoretical because of entropy increase. If the entropy of a system increases, this means that if you wanted to reverse the time-evolution you would have to arrange the initial state very, very precisely, more precisely than is humanly possible. Therefore, many processes which are time-reversible in principle are for all practical purposes irreversible.

Think of mixing dough. You’ll never be able to unmix it in practice. But if only you could arrange precisely enough the position of each single atom, you could very well unmix the dough. The same goes for burning a piece of paper. Irreversible in practice. But in principle, if you only knew precisely enough the details of the smoke and the ashes, you could reverse it.

The evolution equation of quantum mechanics is called the Schroedinger equation and it is just as time-reversible as the evolution equation of classical physics. Quantum mechanics, however, has an additional equation which describes the measurement process, and this equation is not time-reversible. The reason it’s not time-reversible is that you can have different states that, when measured, give you the same measurement outcome. So, if you only know the outcome of the measurement, you cannot tell what was the original state.

Let us come to black holes then. The defining property of a black hole is the horizon, which is a one-way surface. You can only get in, but never get out of a black hole. The horizon does not have substance, it’s really just the name for a location in space. Other than that it’s vacuum.

But quantum theory tells us that vacuum is not nothing. It is full of particle-antiparticle pairs that are constantly created and destroyed. And in general relativity, the notion of a particle itself depends on the observer, much like the passage of time does. For this reason, what looks like vacuum close by the horizon does not look like vacuum far away from the horizon. Which is just another way of saying that black holes emit radiation.

This effect was first derived by Stephen Hawking in the 1970s and the radiation is therefore called Hawking radiation. It’s really important to keep in mind that you get this result by using just the normal quantum theory of matter in the curved space-time of a black hole. You do not need a theory of quantum gravity to derive that black holes radiate.

For our purposes, the relevant property of the radiation is that it is completely thermal. It is entirely determined by the total mass, charge, and spin of the black hole. Besides that, it’s random.

Now, what happens when the black hole radiates is that it loses mass and shrinks. It shrinks until it’s entirely gone and the radiation is the only thing that is left. But if you only have the radiation, then all you know is the mass, charge, and spin of the black hole. You have no idea what formed the black hole originally or what fell in later. Therefore, black hole evaporation is irreversible because many different initial states will result in the same final state. And this is before you have even made a measurement on the radiation.

Such an irreversible process does not fit together with any of the known evolution laws – and that’s the problem. If you combine gravity with quantum theory, it seems, you get a result that’s inconsistent with quantum theory.

As you have probably noticed, I didn’t say anything about information. That’s because really the reference to information in “black hole information loss” is entirely unnecessary and just causes confusion. The problem of black hole “information loss” really has nothing to do with just exactly what you mean by information. It’s just a term that loosely speaking says you can’t tell from the final state what was the exact initial state.

There have been many, many attempts to solve this problem. Literally thousands of papers have been written about this. I will tell you about the most promising solutions some other time, so stay tuned.

Thursday, August 22, 2019

You will probably not understand this

Hieroglyphs. [Image: Wikimedia Commons.]

Two years ago, I gave a talk at the University of Toronto, at the institute for the history and philosophy of science. At the time, I didn’t think much about it. But in hindsight, it changed my life, at least my work-life.

I spoke about the topic of my first book. It’s a talk I have given dozens of times, and though I adapted my slides for the Toronto audience, there was nothing remarkable about it. The oddity was the format of the talk. I would speak for half an hour. After this, someone else would summarize the topic for 15 minutes. Then there would be 15 minutes discussion.

Fine, I said, sounds like fun.

A few weeks before my visit, I was contacted by a postdoc who said he’d be doing the summary. He asked for my slides, and further reading material, and if there was anything else he should know. I sent him references.

But when his turn came to speak, he did not, as I expected, summarize the argument I had delivered. Instead he reported what he had dug up about my philosophy of science, my attitude towards metaphysics, realism, and what I might mean by “explanation” or “theory” and other philosophically loaded words.

He got it largely right, though I cannot today recall the details. I only recall I didn’t have much to say about what struck me as a peculiar exercise, dedicated not to understanding my research, but to understanding me.

It was awkward, too, because I have always disliked philosophers’ dissection of scientists’ lives. Their obsessive analyses of who Schrödinger, Einstein, or Bohr talked to when, about what, in which period of what marriage, never made a lot of sense to me. It reeked too much of hero-worship, looked too much like post-mortem psychoanalysis, about as helpful for understanding Einstein’s work as cutting his brain into slices.

In the months that followed the Toronto talk, though, I began reading my own blogposts with that postdoc’s interpretation in mind. And I realized that in many cases it was essential information to understand what I was trying to get across. In the past year, I have therefore made more effort to repeat background, or at least link to previous pieces, to provide that necessary context. Context which – of course! – I thought is obvious. Because certainly we all agree what a theory is. Right?

But having written a public weblog for more than 12 years makes me a comparatively simple subject of study. I have, over the years, provided explanations for just exactly what I mean when I say “scientific method” or “true” or “real”. So at least you could find out if only you wanted to. Not that I expect anyone who comes here for a 1,000 word essay to study an 800,000 word archive. Still, at least that archive exists. The same, however, isn’t the case for most scientists.

I was reminded of this at a recent workshop where I spoke with another woman about her attempts to make sense of one of her senior colleague’s papers.

I don’t want to name names, but it’s someone whose research you’ll be familiar with if you follow the popular science media. His papers are chronically hard to understand. And I know it isn’t just me who struggles, because I heard a lot of people in the field make dismissive comments about his work. On the occasion which the woman told me about, apparently he got frustrated with his own inability to explain himself, resulting in rather aggressive responses to her questions.

He’s not the only one frustrated. I could tell you many stories of renowned physicists who told me, or wrote to me, about their struggles to get people to listen to them. Being white and male, it seems, doesn’t help. Neither do titles, honors, or award-winning popular science books.

And if you look at the ideas they are trying to get across, there’s a pattern.

These are people who have – in some cases over decades – built their own theoretical frameworks, developed personal philosophies of science, invented their own, idiosyncratic way of expressing themselves. Along the way, they have become incomprehensible for anyone else. But they didn’t notice.

Typically, they have written multiple papers circling around a key insight which they never quite manage to bring into focus. They’re constantly trying and constantly failing. And while they usually have done parts of their work with other people, the co-authors are clearly side-characters in a single-fighter story.

So they have their potentially brilliant insights out there, for anyone to see. And yet, no one has the patience to look at their life’s work. No one makes an effort to decipher their code. In brief, no one understands them.

Of course they’re frustrated. Just as frustrated as I am that no one understands me. Not even the people who agree with me. Especially not those, actually. It’s so frustrating.

The issue, I think, is symptomatic of our times, not only in science, but in society at large. Look at any social media site. You will see people going to great lengths explaining themselves just to end up frustrated and – not seldom – aggressive. They are aggressive because no one listens to what they are trying so hard to say. Indeed, all too often, no one even tries. Why bother if misunderstanding is such an easy win? If you cannot explain yourself, that’s your fault. If you do not understand me, that’s also your fault.

And so, what I took away from my Toronto talk is that communication is much more difficult than we usually acknowledge. It takes a lot of patience, both from the sender and the receiver, to accurately decode a message. You need all that context to make sense of someone else’s ideas. I now see why philosophers spend so much time dissecting the lives of other people. And instead of talking so much, I have come to think, I should listen a little more. Who knows, I might finally understand something.

Saturday, August 17, 2019

How we know that Einstein's General Relativity cannot be quite right

Today I want to explain how we know that the way Einstein thought about gravity cannot be correct.

Einstein’s idea was that gravity is not a force, but it is really an effect caused by the curvature of space and time. Matter curves space-time in its vicinity, and this curvature in return affects how matter moves. This means that, according to Einstein, space and time are responsive. They deform in the presence of matter and not only matter, but really all types of energies, including pressure and momentum flux and so on.

Einstein called his theory “General Relativity” because it’s a generalization of Special Relativity. Both are based on “observer-independence”, that is the idea that the laws of nature should not depend on the motion of an observer. The difference between General Relativity and Special Relativity is that in Special Relativity space-time is flat, like a sheet of paper, while in General Relativity it can be curved, like the often-named rubber sheet.

General Relativity is an extremely well-confirmed theory. It predicts that light rays bend around massive objects, like the sun, which we have observed. The same effect also gives rise to gravitational lensing, which we have also observed. General Relativity further predicts that the universe should expand, which it does. It predicts that time runs more slowly in gravitational potentials, which is correct. General Relativity predicts black holes, and it predicts just how the black hole shadow looks, which is what we have observed. It also predicts gravitational waves, which we have observed. And the list goes on.

So, there is no doubt that General Relativity works extremely well. But we already know that it cannot ultimately be the correct theory for space and time. It is an approximation that works in many circumstances, but fails in others.

We know this because General Relativity does not fit together with another extremely well confirmed theory, that is quantum mechanics. It’s one of these problems that’s easy to explain but extremely difficult to solve.

Here is what goes wrong if you want to combine gravity and quantum mechanics. We know experimentally that particles have some strange quantum properties. They obey the uncertainty principle and they can do things like being in two places at once. Concretely, think about an electron going through a double slit. Quantum mechanics tells us that the particle goes through both slits.

Now, electrons have a mass and masses generate a gravitational pull by bending space-time. This brings up the question, to which place does the gravitational pull go if the electron travels through both slits at the same time. You would expect the gravitational pull to also go to two places at the same time. But this cannot be the case in general relativity, because general relativity is not a quantum theory.

To solve this problem, we have to understand the quantum properties of gravity. We need what physicists call a theory of quantum gravity. And since Einstein taught us that gravity is really about the curvature of space and time, what we need is a theory for the quantum properties of space and time.

There are two other reasons why we know that General Relativity can’t be quite right. Besides the double-slit problem, there is the issue with singularities in General Relativity. Singularities are places where both the curvature and the energy-density of matter become infinitely large; at least that’s what General Relativity predicts. This happens for example inside of black holes and at the beginning of the universe.

In any other theory that we have, singularities are a sign that the theory breaks down and has to be replaced by a more fundamental theory. And we think the same has to be the case in General Relativity, where the more fundamental theory to replace it is quantum gravity.

The third reason we think gravity must be quantized is the trouble with information loss in black holes. If we combine quantum theory with general relativity but without quantizing gravity, then we find that black holes slowly shrink by emitting radiation. This was first derived by Stephen Hawking in the 1970s and so this black hole radiation is also called Hawking radiation.

Now, it seems that black holes can entirely vanish by emitting this radiation. Problem is, the radiation itself is entirely random and does not carry any information. So when a black hole is entirely gone and all you have left is the radiation, you do not know what formed the black hole. Such a process is fundamentally irreversible and therefore incompatible with quantum theory. It just does not fit together. A lot of physicists think that to solve this problem we need a theory of quantum gravity.

So this is how we know that General Relativity must be replaced by a theory of quantum gravity. This problem has been known since the 1930s. Since then, there have been many attempts to solve the problem. I will tell you about this some other time, so don’t forget to subscribe.

Tuesday, August 13, 2019

The Problem with Quantum Measurements

Have you heard that particle physicists want a larger collider because there is supposedly something funny about the Higgs boson? They call it the “Hierarchy Problem,” that there are about 17 orders of magnitude between the Planck mass, which determines the strength of gravity, and the mass of the Higgs boson.

What is problematic about this, you ask? Nothing. Why do particle physicists think it’s problematic? Because they have been told as students it’s problematic. So now they want $20 billion to solve a problem that doesn’t exist.

Let us then look at an actual problem, that is that we don’t know how a measurement happens in quantum mechanics. The discussion of this problem today happens largely among philosophers; physicists pay pretty much no attention to it. Why not, you ask? Because they have been told as students that the problem doesn’t exist.

But there is a light at the end of the tunnel and the light is… you. Yes, you. Because I know that you are just the right person to both understand and solve the measurement problem. So let’s get you started.

Quantum mechanics is today mostly taught in what is known as the Copenhagen Interpretation and it works as follows. Particles are described by a mathematical object called the “wave-function,” usually denoted Ψ (“Psi”). The wave-function is sometimes sharply peaked and looks much like a particle, sometimes it’s spread out and looks more like a wave. Ψ is basically the embodiment of particle-wave duality.

The wave-function moves according to the Schrödinger equation. This equation is compatible with Einstein’s Special Relativity and it can be run both forward and backward in time. If I give you complete information about a system at any one time – i.e., if I tell you the “state” of the system – you can use the Schrödinger equation to calculate the state at all earlier and all later times. This makes the Schrödinger equation what we call a “deterministic” equation.
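What running the equation "forward and backward" means can be sketched for the simplest quantum system, one with just two basis states. The choice of Hamiltonian (the Pauli matrix sigma_x), the units with hbar = 1, and the function name `evolve` are assumptions made for this illustration; for this Hamiltonian the solution of the Schrödinger equation is U(t) = cos(t) - i sin(t) sigma_x.

```python
import math


def evolve(state, t):
    """Apply U(t) = exp(-i * sigma_x * t) to a two-component state (hbar = 1)."""
    a, b = state
    c, s = math.cos(t), math.sin(t)
    # sigma_x swaps the two components, so U(t) mixes them with weight -i*sin(t).
    return (c * a - 1j * s * b, -1j * s * a + c * b)


initial = (1 + 0j, 0j)              # system starts in the first basis state
later = evolve(initial, 0.7)        # run the equation forward in time
recovered = evolve(later, -0.7)     # run it backward by the same amount

norm_later = abs(later[0]) ** 2 + abs(later[1]) ** 2
print(later)
print(recovered)
print(norm_later)
```

Up to floating-point rounding, the recovered state agrees with the initial one and the total probability stays equal to one. Nothing in this evolution is random: the state at one time fixes the state at all other times.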

But the Schrödinger equation alone does not predict what we observe. If you use only the Schrödinger equation to calculate what happens when a particle interacts with a detector, you find that the two undergo a process called “decoherence.” Decoherence wipes out quantum-typical behavior, like dead-and-alive cats and such. What you have left then is a probability distribution for a measurement outcome (what is known as a “mixed state”). You have, say, a 50% chance that the particle hits the left side of the screen. And this, importantly, is not a prediction for a collection of particles or repeated measurements. We are talking about one measurement on one particle.

The moment you measure the particle, however, you know with 100% probability what you have got; in our example you now know which side of the screen the particle is on. This sudden jump of the probability is often referred to as the “collapse” of the wave-function and the Schrödinger equation does not predict it. The Copenhagen Interpretation, therefore, requires an additional assumption called the “Measurement Postulate.” The Measurement Postulate tells you that the probability of whatever you have measured must be updated to 100%.

Now, the collapse together with the Schrödinger equation describes what we observe. But the detector is of course also made of particles and therefore itself obeys the Schrödinger equation. So if quantum mechanics is fundamental, we should be able to calculate what happens during measurement using the Schrödinger equation alone. We should not need a second postulate.

The measurement problem, then, is that the collapse of the wave-function is incompatible with the Schrödinger equation. It isn’t merely that we do not know how to derive it from the Schrödinger equation, it’s that it actually contradicts the Schrödinger equation. The easiest way to see this is to note that the Schrödinger equation is linear while the measurement process is non-linear. This strongly suggests that the measurement is an effective description of some underlying non-linear process, something we haven’t yet figured out.
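The difference between linear and non-linear can be checked by hand in a two-state toy model. In the sketch below, `evolve` stands in for the Schrödinger evolution (with an assumed Hamiltonian sigma_x and hbar = 1) and `collapse` for the measurement update, modeled here as projection onto the observed outcome followed by renormalization. All names and numbers are illustrative assumptions, not a model of any real detector.

```python
import math


def evolve(state, t=0.7):
    """Schroedinger evolution U(t) = exp(-i * sigma_x * t) for a two-state system."""
    a, b = state
    c, s = math.cos(t), math.sin(t)
    return (c * a - 1j * s * b, -1j * s * a + c * b)


def collapse(state, outcome):
    """Measurement update: keep only component `outcome` and renormalize it."""
    amplitude = state[outcome]
    norm = abs(amplitude)
    if norm == 0:
        raise ValueError("this outcome has zero probability")
    updated = [0j, 0j]
    updated[outcome] = amplitude / norm
    return tuple(updated)


def combine(x, psi, y, phi):
    """Return the superposition x*psi + y*phi, component by component."""
    return tuple(x * p + y * q for p, q in zip(psi, phi))


psi = (1 + 0j, 0j)
phi = (math.sqrt(0.5) + 0j, math.sqrt(0.5) + 0j)

# Linear: evolving a superposition equals superposing the evolved states.
evolved_sum = evolve(combine(0.6, psi, 0.8, phi))
sum_of_evolved = combine(0.6, evolve(psi), 0.8, evolve(phi))

# Non-linear: collapsing a superposition differs from superposing the collapsed states.
collapsed_sum = collapse(combine(0.6, psi, 0.8, phi), 0)
sum_of_collapsed = combine(0.6, collapse(psi, 0), 0.8, collapse(phi, 0))

print(evolved_sum, sum_of_evolved)
print(collapsed_sum, sum_of_collapsed)
```

The first pair of results agrees, the second does not. The projection by itself would still be linear; it is the update of the probability to 100%, that is the renormalization, which breaks linearity.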

There is another problem. As an instantaneous process, wave-function collapse doesn’t fit together with the speed of light limit in Special Relativity. This is the “spooky action” that irked Einstein so much about quantum mechanics.

This incompatibility with Special Relativity, however, has (by assumption) no observable consequences, so you can try and convince yourself it’s philosophically permissible (and good luck with that). But the problem comes back to haunt you when you ask what happens with the mass (and energy) of a particle when its wave-function collapses. You’ll notice then that the instantaneous jump screws up General Relativity. (And for this quantum gravitational effects shouldn’t play a role, so mumbling “string theory” doesn’t help.) This issue is still unobservable in practice, all right, but now it’s observable in principle.

One way to deal with the measurement problem is to argue that the wave-function does not describe a real object, but only encodes knowledge, and that probabilities should not be interpreted as frequencies of occurrence, but instead as statements of our confidence. This is what’s known as a “Psi-epistemic” interpretation of quantum mechanics, as opposed to the “Psi-ontic” ones in which the wave-function is a real thing.

The trouble with Psi-epistemic interpretations is that the moment you refer to something like “knowledge” you have to tell me what you mean by “knowledge”, who or what has this “knowledge,” and how they obtain “knowledge.” Personally, I would also really like to know what this knowledge is supposedly about, but if you insist I’ll keep my mouth shut. Even so, for all we presently know, “knowledge” is not fundamental, but emergent. Referring to knowledge in the postulates of your theory, therefore, is incompatible with reductionism. This means if you like Psi-epistemic interpretations, you will have to tell me just why and when reductionism breaks down or, alternatively, tell me how to derive Psi from a more fundamental law.

None of the existing interpretations and modifications of quantum mechanics really solve the problem, which I can go through in detail some other time. For now let me just say that either way you turn the pieces, they won’t fit together.

So, forget about particle colliders; grab a pen and get started.


Saturday, August 10, 2019

Book Review: “The Secret Life of Science” by Jeremy Baumberg

The Secret Life of Science: How It Really Works and Why It Matters
Jeremy Baumberg
Princeton University Press (16 Mar. 2018)

The most remarkable thing about science is that most scientists have no idea how it works. With his 2018 book “The Secret Life of Science,” Jeremy Baumberg aims to change this.

The book is thoroughly researched and well-organized. In the first chapter, Baumberg starts with explaining what science is. He goes about this pragmatically and without getting lost in irrelevant philosophical discussions. In this chapter, he also introduces the terms “simplifier science” and “constructor science” to replace “basic” and “applied” research.

Baumberg suggests thinking of science as an ecosystem with multiple species and flows of nutrients that need to be balanced, which is an analogy that he comes back to throughout the book. This first chapter is followed by a brief chapter about the motivations to do science and its societal relevance.

In the next chapters, Baumberg then focuses on various aspects of a scientist’s work-life and explains how these are organized in practice: Scientific publishing, information sharing in the community (conferences and so on), science communication (PR, science journalism), funding, and hiring. In this, Baumberg makes an effort to distinguish between research in academia and in business, and in many cases he also points out national differences.

The book finishes with a chapter about the future of science and Baumberg’s own suggestions for improvement. Except for the very last chapter, the author does not draw attention to existing problems with the current organization of science, though these will be obvious to most readers.

Baumberg is a physicist by training and, according to the book flap, works in nanotechnology and photonics. As most physicists who do not work in particle physics, he is well aware that particle physics is in deep trouble. He writes:
Knowing the mind of god” and “The theory of everything” are brands currently attached to particle physics. Yet they have become less powerful with time, attracting an air of liability, perhaps reaching that of a “toxic brand.” That the science involved now finds it hard to shake off precisely this layer of values attached to them shows how sticky they are.
The book contains a lot of concrete information, for example about salaries and grant success rates. I have generally found Baumberg’s analysis to be spot on, for example when he writes “Science spending seems to rise until it becomes noticed and then stops.” Or
Because this competition [for research grants] is so well defined as a clear race for money it can become the raison d’etre for scientists’ existence, rather than just what is needed to develop resources to actually do science.
On counting citations, he likewise remarks aptly:
“[The h-index rewards] wide collaborators rather than lone specialists, rewards fields that cite more, and rewards those who always stay at the trendy edge of all research.”
Unfortunately I have to add that the book is not particularly engagingly written. Some of the chapters could have been shorter, Baumberg overuses the metaphor of the ecosystem, and the figures are not helpful. To give you an idea why I say this, I challenge you to make sense of this illustration:

In summary, Baumberg’s is a useful book though it’s somewhat tedious to read. Nevertheless, I think everyone who wants to understand how science works in reality should read it. It’s time we get over the idea that science somehow magically self-corrects. Science is the way we organize knowledge discovery, and its success depends on us paying attention to how it is organized.

Wednesday, August 07, 2019

10 differences between artificial intelligence and human intelligence

Today I want to tell you what is artificial about artificial intelligence. There is, of course, the obvious, which is that the brain is warm, wet, and wiggly, while a computer is not. But more importantly, there are structural differences between human and artificial intelligence, which I will get to in a moment.

Before we can talk about this though, I have to briefly tell you what “artificial intelligence” refers to.

What goes as “artificial intelligence” today are neural networks. A neural network is a computer algorithm that imitates certain functions of the human brain. It contains virtual “neurons” that are arranged in “layers” which are connected with each other. The neurons pass on information and thereby perform calculations, much like neurons in the human brain pass on information and thereby perform calculations.

In the neural net, the neurons are just numbers in the code; typically they have values between 0 and 1. The connections between the neurons also have numbers associated with them, and those are called “weights”. These weights tell you how much the information from one layer matters for the next layer.

The values of the neurons and the weights of the connections are essentially the free parameters of the network. And by training the network you want to find those values of the parameters that minimize a certain function, called the “loss function”.

So it’s really an optimization problem that neural nets solve. In this optimization, the magic of neural nets happens through what is known as backpropagation. This means if the net gives you a result that is not particularly good, you go back and change the weights of the neurons and their connections. This is how the net can “learn” from failure. Again, this plasticity mimics that of the human brain.
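To make the optimization concrete, here is a minimal sketch of a tiny network trained by backpropagation. It is an illustration under stated assumptions, not how any production system is written: the two-input, two-hidden-neuron layout, the XOR toy data, the starting values, the learning rate, and the mean-squared-error loss are all choices made for this example. In this sketch the trainable parameters are the connection weights plus one bias per neuron.

```python
import math

def sigmoid(z):
    # Squashes any number into the range (0, 1), like the neuron values in the text.
    return 1.0 / (1.0 + math.exp(-z))

# Toy training data (assumption for this sketch): the XOR function.
DATA = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0), ((1.0, 0.0), 1.0), ((1.0, 1.0), 0.0)]

def forward(params, x):
    # One hidden layer with two neurons, one output neuron.
    w_hidden, b_hidden, w_out, b_out = params
    hidden = [sigmoid(w[0] * x[0] + w[1] * x[1] + b)
              for w, b in zip(w_hidden, b_hidden)]
    out = sigmoid(sum(w * h for w, h in zip(w_out, hidden)) + b_out)
    return hidden, out

def loss(params):
    # Loss function: mean squared difference between output and target.
    return sum((forward(params, x)[1] - y) ** 2 for x, y in DATA) / len(DATA)

def train_step(params, lr):
    # Backpropagation: push the output error back through the layers
    # to get the gradient of the loss for every weight and bias.
    w_hidden, b_hidden, w_out, b_out = params
    g_wh = [[0.0, 0.0] for _ in w_hidden]
    g_bh = [0.0 for _ in b_hidden]
    g_wo = [0.0 for _ in w_out]
    g_bo = 0.0
    n = len(DATA)
    for x, y in DATA:
        hidden, out = forward(params, x)
        delta_out = 2.0 * (out - y) * out * (1.0 - out) / n
        g_bo += delta_out
        for j, h in enumerate(hidden):
            g_wo[j] += delta_out * h
            delta_h = delta_out * w_out[j] * h * (1.0 - h)
            g_bh[j] += delta_h
            g_wh[j][0] += delta_h * x[0]
            g_wh[j][1] += delta_h * x[1]
    # Gradient descent: move each parameter a small step against its gradient.
    new_wh = [[w[0] - lr * g[0], w[1] - lr * g[1]] for w, g in zip(w_hidden, g_wh)]
    new_bh = [b - lr * g for b, g in zip(b_hidden, g_bh)]
    new_wo = [w - lr * g for w, g in zip(w_out, g_wo)]
    return (new_wh, new_bh, new_wo, b_out - lr * g_bo)

# Arbitrary starting values (assumption); real code would draw them at random.
params = ([[0.5, -0.4], [-0.3, 0.8]], [0.1, -0.2], [0.7, -0.6], 0.05)
initial_loss = loss(params)
for _ in range(5000):
    params = train_step(params, lr=0.5)
final_loss = loss(params)
print("loss before training:", initial_loss)
print("loss after training: ", final_loss)
```

The only thing the training loop ever looks at is the loss, which is why the choice of loss function matters so much.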

For a great introduction to neural nets, I can recommend this 20-minute video by 3Blue1Brown.

Having said this, here are the key differences between artificial and real intelligence.

1. Form and Function

A neural net is software running on a computer. The “neurons” of an artificial intelligence are not physical. They are encoded in bits and strings on hard disks or silicon chips and their physical structure looks nothing like that of actual neurons. In the human brain, in contrast, form and function go together.

2. Size

The human brain has about 100 billion neurons. Current neural nets typically have a few hundred or so.

3. Connectivity

In a neural net each layer is usually fully connected to the previous and next layer. But the brain doesn’t really have layers. It instead relies on a lot of pre-defined structure. Not all regions of the human brain are equally connected and the regions are specialized for certain purposes.

4. Power Consumption

The human brain is dramatically more energy-efficient than any existing artificial intelligence. The brain uses around 20 Watts, which is comparable to what a standard laptop uses today. But with that power the brain handles a million times more neurons.

5. Architecture

In a neural network, the layers are neatly ordered and are addressed one after the other. The human brain, on the other hand, does a lot of parallel processing and not in any particular order.

6. Activation Potential

In the real brain neurons either fire or don’t. In a neural network the firing is mimicked by continuous values instead, so the artificial neurons can smoothly slide from off to on, which real neurons can’t.

7. Speed

The human brain is much, much slower than any artificially intelligent system. A standard computer performs some 10 billion operations per second. Real neurons, on the other hand, fire at a frequency of at most a thousand times per second.

8. Learning Technique

Neural networks learn by producing output, and if this output is of low performance according to the loss function, then the net responds by changing the weights of the neurons and their connections. No one knows in detail how humans learn, but that’s not how it works.

9. Structure

A neural net starts from scratch every time. The human brain, on the other hand, has a lot of structure already wired into its connectivity, and it draws on models which have proved useful during evolution.

10. Precision

The human brain is much noisier and less precise than a neural net running on a computer. This means the brain basically cannot run the same learning mechanism as a neural net and it’s probably using an entirely different mechanism.

A consequence of these differences is that artificial intelligence today needs a lot of training with a lot of carefully prepared data, which is very unlike how human intelligence works. Neural nets do not build models of the world; instead they learn to classify patterns, and this pattern recognition can fail with only small changes. A famous example is that you can add small amounts of noise to an image, amounts so small that your eyes will not see a difference, but an artificially intelligent system might be fooled into thinking a turtle is a rifle.
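Why can such tiny changes matter? Here is a minimal sketch of the mechanism, using a deliberately simple linear classifier instead of a real deep network. Everything in it is made up for illustration: the weights, the 10,000-pixel “image”, the labels, and the step size `eps`. The perturbation nudges every pixel slightly in the direction that lowers the score, in the spirit of what is known as the fast gradient sign method.

```python
def score(weights, pixels):
    # Linear classifier: a weighted sum over all pixels.
    return sum(w * p for w, p in zip(weights, pixels))

def classify(weights, pixels):
    # Hypothetical labels, chosen to echo the turtle/rifle example.
    return "turtle" if score(weights, pixels) > 0 else "rifle"

def adversarial(weights, pixels, eps):
    # Shift each pixel by eps against the sign of its weight.
    # No single pixel changes by more than eps, but the changes add up.
    return [p - eps * (1 if w > 0 else -1) for w, p in zip(weights, pixels)]

N = 10_000  # number of pixels (assumption)
weights = [1.0 if i % 2 == 0 else -1.0 for i in range(N)]
image = [0.51 if i % 2 == 0 else 0.50 for i in range(N)]

perturbed = adversarial(weights, image, eps=0.01)
largest_change = max(abs(a - b) for a, b in zip(image, perturbed))

print(classify(weights, image))      # turtle
print(classify(weights, perturbed))  # rifle
print(largest_change)                # about 0.01 on a 0-to-1 brightness scale
```

Each pixel moves by one percent of the brightness range, but because all 10,000 pixels move in the direction the classifier is sensitive to, the total score changes by about 100 and the label flips.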

Neural networks are also presently not good at generalizing what they have learned from one situation to the next, and their success very strongly depends on defining just the correct “loss function”. If you don’t think about that loss function carefully enough, you will end up optimizing something you didn’t want. Like this simulated self-driving car trained to move at constant high speed, which learned to rapidly spin in a circle.

But neural networks excel at some things, such as classifying images or extrapolating data that doesn’t have any well-understood trend. And maybe the point of artificial intelligence is not to make it all that similar to natural intelligence. After all, the most useful machines we have, like cars or planes, are useful exactly because they do not mimic nature. Instead, we may want to build machines specialized in tasks we are not good at.

Tuesday, August 06, 2019

Special Breakthrough Prize awarded for Supergravity

Breakthrough Prize Trophy.
[Image: Breakthrough Prize]
The Breakthrough Prize is an initiative founded by billionaire Yuri Milner, now funded by a group of rich people which includes, next to Milner himself, Sergey Brin, Anne Wojcicki, and Mark Zuckerberg. The Prize is awarded in three different categories, Mathematics, Fundamental Physics, and Life Sciences. Today, a Special Breakthrough Prize in Fundamental Physics has been awarded to Sergio Ferrara, Dan Freedman, and Peter van Nieuwenhuizen for the invention of supergravity in 1976. The Prize of 3 million US$ will be split among the winners.

Interest in supergravity arose in the 1970s when physicists began to search for a theory of everything that would combine all four known fundamental forces into one. By then, string theory had been shown to require supersymmetry, a hypothetical new symmetry which implies that all the already known particles have – so far undiscovered – partner particles. Supersymmetry, however, initially only worked for the three non-gravitational forces, that is the electromagnetic force and the strong and weak nuclear forces. With supergravity, gravity could be included too, thereby bringing physicists one step closer to their goal of unifying all the interactions.

In supergravity, the gravitational interaction is associated with a messenger particle – the graviton – and this graviton has a supersymmetric partner particle called the “gravitino”. There are several types of supergravitational theories, because there are different ways of realizing the symmetry. Supergravity in the context of string theory always requires additional dimensions of space, which have not been seen. The gravitational theory one obtains this way is also not the same as Einstein’s General Relativity, because one gets additional fields that can be difficult to bring into agreement with observation. (For more about the problems with string theory, please watch my video.)

To date, we have no evidence that supergravity is a correct description of nature. Supergravity may one day become useful to calculate properties of certain materials, but so far this research direction has not led to much.

The works by Ferrara, Freedman, and van Nieuwenhuizen have arguably been influential, if by influential you mean that papers have been written about them. Supergravity and supersymmetry are mathematically very fertile ideas. They lend themselves to calculations that otherwise would not be possible and that is how, in the past four decades, physicists have successfully built a beautiful, supersymmetric, math-castle on nothing but thin air.

Awarding a scientific prize, especially one accompanied by so much publicity, for an idea that has no evidence speaking for it, sends the message that in the foundations of physics contact with observation is no longer relevant. If you want to be successful in my research area, it seems, what matters is that a large number of people follow in your footsteps, not that your work is useful to explain natural phenomena. This Special Prize doesn’t only signal to the public that the foundations of physics are no longer part of science, it also discourages people in the field from taking on the hard questions. Congratulations.

Update Aug 7th: Corrected the first paragraph. The earlier version incorrectly stated that each of the recipients gets $3 million.

Thursday, August 01, 2019

Automated Discovery


In 1986, Don Swanson from the University of Chicago discovered a discovery.

Swanson (who passed away in 2012) was an information scientist and a pioneer in literature analysis. In the 1980s, he studied the distribution of references in scientific papers and found that, on occasion, studies on two separate research topics would have few references between them, but would refer to a common, third, set of papers. He conjectured that might indicate so-far unknown links between the separate research topics.

Indeed, Swanson found a concrete example of such a link. Already in the 1980s, scientists knew that certain types of fish oils benefit blood composition and blood vessels. So there was one body of literature linking circulatory health to fish oil. They had also found, in another line of research, that patients with Raynaud’s disease do better if their circulatory health improves. This led Swanson to conjecture that patients with Raynaud’s disease could benefit from fish oil. In 1993, a clinical trial demonstrated that this hypothesis was correct.
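Swanson’s idea can be sketched in a few lines of code. The following is a toy illustration, not his actual procedure: the paper ids, the topic labels, the citation lists, and the two thresholds are all invented for this example. It looks for pairs of topics that do not cite each other directly but share references to a third body of literature.

```python
from itertools import combinations

# Hypothetical toy corpus: paper id -> (topic, ids of the papers it cites).
PAPERS = {
    "fish1": ("fish oil", {"blood1", "blood2"}),
    "fish2": ("fish oil", {"blood2", "fish1"}),
    "ray1": ("Raynaud", {"blood1", "blood3"}),
    "ray2": ("Raynaud", {"blood2", "ray1"}),
    "blood1": ("circulation", set()),
    "blood2": ("circulation", {"blood1"}),
    "blood3": ("circulation", set()),
    "mag1": ("magnetism", {"mag2"}),
    "mag2": ("magnetism", set()),
}

def candidate_links(papers, max_direct=0, min_shared=2):
    """Find topic pairs with few direct references but a shared third literature."""
    topic_of = {pid: topic for pid, (topic, _) in papers.items()}

    # Collect, for each topic, the set of all papers cited from within it.
    cited_by_topic = {}
    for _, (topic, refs) in papers.items():
        cited_by_topic.setdefault(topic, set()).update(refs)

    results = []
    for a, b in combinations(sorted(cited_by_topic), 2):
        # Direct links: distinct papers of one topic cited by the other.
        direct = (sum(1 for p in cited_by_topic[a] if topic_of.get(p) == b)
                  + sum(1 for p in cited_by_topic[b] if topic_of.get(p) == a))
        # Shared references that belong to neither of the two topics.
        shared = {p for p in cited_by_topic[a] & cited_by_topic[b]
                  if topic_of.get(p) not in (a, b)}
        if direct <= max_direct and len(shared) >= min_shared:
            results.append((a, b, sorted(shared)))
    return results

links = candidate_links(PAPERS)
print(links)  # [('Raynaud', 'fish oil', ['blood1', 'blood2'])]
```

In this toy corpus the only pair flagged is the one that never cites each other but leans on the same circulation papers.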

You may find this rather obvious. I would agree it’s not a groundbreaking insight, but this isn’t the point. The point is that the scientific community missed this obvious insight. It was right there, in front of their eyes, but no one noticed.

30 years after Swanson’s seminal paper, we have more data than ever about scientific publications. And just the other week, Nature published a new example of what you can do with it.

In the new paper, a group of researchers from California studied the materials science literature. They did not, like Swanson, look for relations between research studies by using citations, but they did a (much more computationally intensive) word-analysis of paper abstracts (not unlike the one we did in our paper). This analysis serves to identify the most relevant words associated with a manuscript, and to find relations between these words.

Previous studies have shown that words, treated as vectors in a high-dimensional space, can be added and subtracted. The most famous example is that the combination “King – Man + Woman” gives a new vector that turns out to be associated with the word “Queen”. In the new paper, the authors report finding similar examples in the materials science literature, such as “ferromagnetic − NiFe + IrMn” which adds up to “antiferromagnetic”.
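Here is a minimal sketch of what such vector arithmetic looks like. The three-dimensional vectors are hand-picked and purely hypothetical; real word vectors are learned from large amounts of text and have hundreds of dimensions. The “association” between two words is measured here by the cosine of the angle between their vectors, which is a common choice.

```python
import math

# Hand-picked toy vectors (assumption). The three dimensions can be read
# loosely as "royalty", "male", "female".
VECTORS = {
    "king":   [0.9, 0.8, 0.1],
    "queen":  [0.9, 0.1, 0.8],
    "man":    [0.1, 0.9, 0.1],
    "woman":  [0.1, 0.1, 0.9],
    "prince": [0.8, 0.9, 0.2],
    "apple":  [0.0, 0.1, 0.1],
}

def cosine(u, v):
    # Cosine similarity: 1 means same direction, 0 means unrelated.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def analogy(positive, negative, vectors):
    # Add the "positive" words, subtract the "negative" ones,
    # then return the closest remaining word.
    dim = len(next(iter(vectors.values())))
    target = [0.0] * dim
    for word in positive:
        target = [t + x for t, x in zip(target, vectors[word])]
    for word in negative:
        target = [t - x for t, x in zip(target, vectors[word])]
    excluded = set(positive) | set(negative)
    candidates = [w for w in vectors if w not in excluded]
    return max(candidates, key=lambda w: cosine(target, vectors[w]))

result = analogy(positive=["king", "woman"], negative=["man"], vectors=VECTORS)
print(result)  # queen
```

The input words themselves are left out of the search, as is usual for this kind of analogy test, because otherwise the answer is often just one of the words you started with.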

Even more remarkable though, they noticed that a number of materials whose names are close to the word “thermoelectric” were never actually mentioned together with the word “thermoelectric” in any paper’s abstract. This suggests, so the authors claim, that these materials may be thermoelectric, but so-far no one has noticed.
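The logic of this step can be sketched as follows. This is a toy illustration and not the authors’ code: the material names, the two-dimensional vectors, and the set of materials already mentioned alongside “thermoelectric” are all invented. The sketch ranks materials by how close their vectors are to the vector for “thermoelectric” and keeps only those that never appeared together with that word.

```python
import math

def cosine(u, v):
    # Cosine similarity between two vectors.
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

# Hypothetical word vectors, as if learned from paper abstracts.
KEYWORD_VECTOR = [1.0, 0.2]  # stands in for "thermoelectric"
MATERIAL_VECTORS = {
    "material_A": [0.9, 0.3],
    "material_B": [0.1, 1.0],
    "material_C": [0.8, 0.1],
    "material_D": [1.0, 0.25],
}

# Materials that already appear with the keyword in some abstract (assumption).
ALREADY_MENTIONED = {"material_D"}

def predict(keyword_vector, material_vectors, already_mentioned, top_n):
    # Keep only materials never mentioned with the keyword,
    # then sort them by similarity to the keyword, best first.
    unstudied = [m for m in material_vectors if m not in already_mentioned]
    ranked = sorted(unstudied,
                    key=lambda m: cosine(keyword_vector, material_vectors[m]),
                    reverse=True)
    return ranked[:top_n]

predictions = predict(KEYWORD_VECTOR, MATERIAL_VECTORS, ALREADY_MENTIONED, top_n=2)
print(predictions)  # ['material_C', 'material_A']
```

Here `material_D` is the closest of all, but it is dropped because it is already known, which is the whole point: the interesting candidates are the close ones that nobody has written about yet.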

They have tested how well this works by making back-dated predictions for the discovery of new thermoelectric materials using only papers published until one of the years between 2001 and 2018. For each of these historical datasets, they used the relations between words in the abstracts to predict the 50 thermoelectric materials most likely to be found in the future. And it worked! In the five years after the historical data-cut, the identified materials were on average eight times more likely to be studied as thermoelectrics than were randomly chosen unstudied materials. The authors have now also made real predictions for new thermoelectric materials. We will see in the coming years how those pan out.

I think that analyses like this have a lot of potential. Indeed, one of the things that keeps me up at night is the possibility that we might already have all the knowledge necessary to make progress in the foundations of physics, we just haven’t connected the dots. Smart tools to help scientists decide what papers to pay attention to could greatly aid knowledge discovery.