Showing posts with label Sociology of Science.

Friday, June 30, 2017

To understand the foundations of physics, study numerology

Numbers speak. [Img Src]
Once upon a time, we had problems in the foundations of physics. Then we solved them. That was 40 years ago. Today we spend most of our time discussing non-problems.

Here is one of these non-problems. Did you know that the universe is spatially almost flat? There is a number in the cosmological concordance model called the “curvature parameter” that, according to current observation, has a value of 0.000 plus-minus 0.005.

Why is that a problem? I don’t know. But here is the story that cosmologists tell.

From the equations of General Relativity you can calculate the dynamics of the universe. This means you get relations between the values of observable quantities today and the values they must have had in the early universe.

The contribution of curvature to the dynamics, it turns out, increases relative to that of matter and radiation as the universe expands. This means for the curvature-parameter to be smaller than 0.005 today, it must have been smaller than 10^-60 or so briefly after the Big Bang.
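For orientation, here is the back-of-the-envelope version of that scaling. The curvature term in the Friedmann equation dilutes more slowly than radiation, so its relative contribution grows with the scale factor a:

$$\Omega_k(a) \equiv -\frac{k}{a^2 H^2}\,, \qquad \frac{\Omega_k(a)}{\Omega_r(a)} \propto \frac{a^{-2}}{a^{-4}} = a^2\,.$$

If the scale factor has grown by a factor of very roughly 10^30 since the early epoch in question (a number I pick purely for illustration), then the ratio of curvature to radiation was smaller by a factor (10^-30)^2 = 10^-60 back then than it is today – which is, up to factors that don't matter for the argument, where the 10^-60 comes from.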

That, so the story goes, is bad, because where would you get such a small number from?

Well, let me ask in return, where do we get any number from anyway? Why is 10^-60 any worse than, say, 1.778, or exp(67π)?

That the curvature must have had a small value in the early universe is called the “flatness problem,” and since it’s on Wikipedia it’s officially more real than me. And it’s an important problem. It’s important because it justifies the many attempts to solve it.

The presently most popular solution to the flatness problem is inflation – a rapid period of expansion briefly after the Big Bang. Because inflation decreases the relevance of curvature contributions dramatically – by something like 200 orders of magnitude or so – you no longer have to start with some tiny value. Instead, if you start with any curvature parameter smaller than 10^197, the value today will be compatible with observation.
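For what it's worth, the suppression during inflation itself fits in one line. With the Hubble rate approximately constant while the scale factor grows by N e-folds, the same definition as above gives

$$\Omega_k \propto \frac{1}{a^2 H^2} \quad\Rightarrow\quad \Omega_k^{\rm end} \approx e^{-2N}\,\Omega_k^{\rm start}\,,$$

so each e-fold buys roughly 0.87 orders of magnitude of suppression, and the "200 orders of magnitude or so" correspond to a couple hundred e-folds.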

Ah, you might say, but clearly there are more numbers smaller than 10^197 than there are numbers smaller than 10^-60, so isn’t that an improvement?

Unfortunately, no. There are infinitely many numbers in both cases. Besides that, it’s totally irrelevant. Whatever the curvature parameter, the probability to get that specific number is zero regardless of its value. So the argument is bunk. Logical mush. Plainly wrong. Why do I keep hearing it?

Worse, if you want to pick parameters for our theories according to a uniform probability distribution on the real axis, then all parameters would come out infinitely large with probability one. Sucks. Also, doesn’t describe observations*.

And there is another problem with that argument, namely, what probability distribution are we even talking about? Where did it come from? Certainly not from General Relativity because a theory can’t predict a distribution on its own theory space. More logical mush.

If you have trouble seeing the trouble, let me ask the question differently. Suppose we managed to measure the curvature parameter today to a precision of 60 digits after the point. Yeah, it’s not going to happen, but bear with me. Now you’d have to explain all these 60 digits – but that is as fine-tuned as a zero followed by 60 zeroes would have been!

Here is a different example for this idiocy. High energy physicists think it’s a problem that the mass of the Higgs is 15 orders of magnitude smaller than the Planck mass because that means you’d need two constants to cancel each other for 15 digits. That’s supposedly unlikely, but please don’t ask anyone according to which probability distribution it’s unlikely. Because they can’t answer that question. Indeed, depending on character, they’ll either walk off or talk down to you. Guess how I know.
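To spell out the cancellation being referred to: the statement is usually phrased in terms of the quantum corrections to the Higgs mass which, with a cutoff at the Planck scale, are themselves of Planckian size. Schematically,

$$m_H^2 = m_{\rm bare}^2 + \delta m^2\,, \qquad \delta m^2 \sim M_{\rm Pl}^2\,,$$

so if the observed Higgs mass is 15 orders of magnitude below the Planck mass, the two terms on the right have to cancel to about 30 digits in m_H^2 – the "15 digits" above, counted in the mass itself.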

Now consider for a moment that the mass of the Higgs was actually about as large as the Planck mass. To be precise, let’s say it’s 1.1370982612166126 times the Planck mass. Now you’d again have to explain how you get exactly those 16 digits. But that is, according to current lore, not a finetuning problem. So, erm, what was the problem again?

The cosmological constant problem is another such confusion. If you don’t know how to calculate that constant – and we don’t, because we don’t have a theory for Planck scale physics – then it’s a free parameter. You go and measure it and that’s all there is to say about it.

And there are more numerological arguments in the foundations of physics, all of which are wrong, wrong, wrong for the same reasons. The unification of the gauge couplings. The so-called WIMP-miracle (RIP). The strong CP problem. All these are numerical coincidences that supposedly need an explanation. But you can’t speak about coincidence without quantifying a probability!

Do my colleagues deliberately lie when they claim these coincidences are problems, or do they actually believe what they say? I’m not sure what’s worse, but suspect most of them actually believe it.

Many of my readers like to jump to conclusions about my opinions. But you are not one of them. You and I, therefore, both know that I did not say that inflation is bunk. Rather I said that the most common arguments for inflation are bunk. There are good arguments for inflation, but that’s a different story and shall be told another time.

And since you are among the few who actually read what I wrote, you also understand I didn’t say the cosmological constant is not a problem. I just said its value isn’t the problem. What actually needs an explanation is why it doesn’t fluctuate. Which is what vacuum fluctuations should do, and what gives rise to what Niayesh called the cosmological non-constant problem.

Enlightened as you are, you would also never think I said we shouldn’t try to explain the value of some parameter. It is always good to look for better explanations for the assumptions underlying current theories – where by “better” I mean either simpler or able to explain more.

No, what draws my ire is that most of the explanations my colleagues put forward aren’t any better than just fixing a parameter through measurement – they are worse. The reason is that the problem they are trying to solve – the smallness of some numbers – isn’t a problem. It’s merely a property they perceive as inelegant.

I therefore have a lot of sympathy for philosopher Tim Maudlin who recently complained that “attention to conceptual clarity (as opposed to calculational technique) is not part of the physics curriculum” which results in inevitable confusion – not to mention waste of time.

In response, a pseudonymous commenter remarked that a discussion between a physicist and a philosopher of physics is “like a debate between an experienced car mechanic and someone who has read (or perhaps skimmed) a book about cars.”

Trouble is, in the foundations of physics today most of the car mechanics are repairing cars that run just fine – and then bill you for it.

I am not opposed to using aesthetic arguments as research motivations. We all have to get our inspiration from somewhere. But I do think it’s bad science to pretend numerological arguments are anything more than appeals to beauty. That very small or very large numbers require an explanation is a belief – and it’s a belief that has been adopted by the vast majority of the community. That shouldn’t happen in any scientific discipline.

As a consequence, high energy physics and cosmology are now populated with people who don’t understand that finetuning arguments have no logical basis. The flatness “problem” is preached in textbooks. The naturalness “problem” is all over the literature. The cosmological constant “problem” is on every popular science page. And so the myths live on.

If you break down the numbers, it’s me against ten-thousand of the most intelligent people on the planet. Am I crazy? I surely am.


*Though that’s exactly what happens with bare values.

Thursday, April 06, 2017

Dear Dr. B: Why do physicists worry so much about the black hole information paradox?

    “Dear Dr. B,

    Why do physicists worry so much about the black hole information paradox, since it looks like there are several, more mundane processes that are also not reversible? One obvious example is the increase of the entropy in an isolated system and another one is performing a measurement according to quantum mechanics.

    Regards, Petteri”


Dear Petteri,

This is a very good question. Confusion orbits the information paradox like accretion disks orbit supermassive black holes. A few weeks ago, I figured even my husband doesn’t really know what the problem is, and he not only has a PhD in physics, he has also endured me rambling about the topic for more than 15 years!

So, I’m happy to elaborate on why theorists worry so much about black hole information. There are two aspects to this worry: one scientific and one sociological. Let me start with the scientific aspect. I’ll comment on the sociology below.

In classical general relativity, black holes aren’t much trouble. Yes, they contain a singularity where curvature becomes infinitely large – and that’s deemed unphysical – but the singularity is hidden behind the horizon and does no harm.

As Stephen Hawking pointed out, however, if you take into account that the universe – even vacuum – is filled with quantum fields of matter, you can calculate that black holes emit particles, now called “Hawking radiation.” This combination of unquantized gravity with quantum fields of matter is known as “semi-classical” gravity, and it should be a good approximation as long as quantum effects of gravity can be neglected, which means as long as you’re not close to the singularity.

Illustration of black hole with jet and accretion disk.
Image credits: NASA.


Hawking radiation consists of pairs of entangled particles. Of each pair, one particle falls into the black hole while the other one escapes. This leads to a net loss of mass of the black hole, i.e., the black hole shrinks. It loses mass until entirely evaporated and all that’s left are the particles of the Hawking radiation which escaped.
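For orientation, the semi-classical formulas behind this shrinking are the Hawking temperature and the resulting mass-loss rate (with the proportionality constant in the mass-loss rate suppressed):

$$T_H = \frac{\hbar c^3}{8\pi G k_B M}\,, \qquad \frac{dM}{dt} \propto -\frac{1}{M^2} \;\;\Rightarrow\;\; t_{\rm evap} \sim M^3\,.$$

A smaller black hole is hotter and evaporates faster; the total lifetime grows with the third power of the initial mass.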

Problem is, the surviving particles don’t contain any information about what formed the black hole. And not only that, information of the particles’ partners that went into the black hole is also lost. If you investigate the end-products of black hole evaporation, you therefore can’t tell what the initial state was; the only quantities you can extract are the total mass, charge, and angular momentum – the three “hairs” of black holes (plus one qubit). Black hole evaporation is therefore irreversible.



Irreversible processes however don’t exist in quantum field theory. In technical jargon, black holes can turn pure states into mixed states, something that shouldn’t ever happen. Black hole evaporation thus gives rise to an internal contradiction, or “inconsistency”: You combine quantum field theory with general relativity, but the result isn’t compatible with quantum field theory.

To address your questions: Entropy increase usually does not imply a fundamental irreversibility, but merely a practical one. Entropy increases because the probability to observe the reverse process is small. But fundamentally, any process is reversible: Unbreaking eggs, unmixing dough, unburning books – mathematically, all of this can be described just fine. We merely never see this happening because such processes would require exquisitely finetuned initial conditions. A large entropy increase makes a process irreversible in practice, but not irreversible in principle.

That is true for all processes except black hole evaporation. No amount of finetuning will bring back the information that was lost in a black hole. It’s the only known case of a fundamental irreversibility. We know it’s wrong, but we don’t know exactly what’s wrong. That’s why we worry about it.

The irreversibility in quantum mechanics, which you are referring to, comes from the measurement process, but black hole evaporation is irreversible already before a measurement was made. You could argue then, why should it bother us if everything we can possibly observe requires a measurement anyway? Indeed, that’s an argument which can and has been made. But in and by itself it doesn’t remove the inconsistency. You still have to demonstrate just how to reconcile the two mathematical frameworks.

This problem has attracted so much attention because the mathematics is so clear-cut and the implications are so deep. Hawking evaporation relies on the quantum properties of matter fields, but it does not take into account the quantum properties of space and time. It is hence widely believed that quantizing space-time is necessary to remove the inconsistency. Figuring out just what it would take to prevent information loss would teach us something about the still unknown theory of quantum gravity. Black hole information loss, therefore, is a lovely logical puzzle with large potential pay-off – that’s what makes it so addictive.

Now some words on the sociology. It will not have escaped your attention that the problem isn’t exactly new. Indeed, its origin predates my birth. Thousands of papers have been written about it during my lifetime, and hundreds of solutions have been proposed, but theorists just can’t agree on one. The reason is that they don’t have to: For the black holes which we observe (e.g., at the center of our galaxy), the temperature of the Hawking radiation is so tiny there’s no chance of measuring any of the emitted particles. And so, black hole evaporation is the perfect playground for mathematical speculation.
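To put a number on “tiny”: a back-of-the-envelope estimate with the Hawking temperature quoted above gives

$$T_H \approx 6\times 10^{-8}\,{\rm K}\;\times\;\frac{M_\odot}{M}\,,$$

so a solar-mass black hole radiates at roughly 10^-7 Kelvin, and the approximately 4 million solar mass black hole at the center of our galaxy at roughly 10^-14 Kelvin – hopelessly far below the 2.7 Kelvin of the cosmic microwave background it sits in.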

[Lots of Papers. Img: 123RF]
There is an obvious solution to the black hole information loss problem which was pointed out already in the early days. The reason that black holes destroy information is that whatever falls through the horizon ends up in the singularity where it is ultimately destroyed. The singularity, however, is believed to be a mathematical artifact that should no longer be present in a theory of quantum gravity. Remove the singularity and you remove the problem.

Indeed, Hawking’s calculation breaks down when the black hole has lost almost all of its mass and has become so small that quantum gravity is important. This would mean the information would just come out in the very late, quantum gravitational, phase and no contradiction ever occurs.

This obvious solution, however, is also inconvenient because it means that nothing can be calculated if one doesn’t know what happens near the singularity and in strong curvature regimes which would require quantum gravity. It is, therefore, not a fruitful idea. Not many papers can be written about it and not many have been written about it. It’s much more fruitful to assume that something else must go wrong with Hawking’s calculation.

Sadly, if you dig into the literature and try to find out on which grounds the idea that information comes out in the strong curvature phase was discarded, you’ll find it’s mostly sociology and not scientific reasoning.

If the information is kept by the black hole until late, this means that small black holes must be able to keep many different combinations of information inside. There are a few papers which have claimed that these black holes then must emit their information slowly, which means small black holes would behave like a technically infinite number of particles. In this case, so the claim, they should be produced in infinite amounts even in weak background fields (say, nearby Earth), which is clearly incompatible with observation.

Unfortunately, these arguments are based on an unwarranted assumption, namely that the interior of small black holes has a small volume. In GR, however, there isn’t any obvious relation between surface area and volume because space can be curved. The assumption that such small black holes, for which quantum gravity is strong, can be effectively described as particles is equally shaky. (For details and references, please see this paper I wrote with Lee some years ago.)

What happened, to make a long story short, is that Lenny Susskind wrote a dismissive paper about the idea that information is kept in black holes until late. This dismissal gave everybody else the opportunity to claim that the obvious solution doesn’t work and to henceforth produce endless amounts of papers on other speculations.

Excuse the cynicism, but that’s my take on the situation. I’ll even admit having contributed to the paper pile because that’s how academia works. I too have to make a living somehow.

So that’s the other reason why physicists worry so much about the black hole information loss problem: Because it’s speculation unconstrained by data, it’s easy to write papers about it, and there are so many people working on it that citations aren’t hard to come by either.

Thanks for an interesting question, and sorry for the overly honest answer.

Thursday, December 08, 2016

No, physicists have no fear of math. But they should have more respect.

Heart curve. [Img Src]
“Even physicists are ‘afraid’ of mathematics,” a recent phys.org headline screamed at me. This, I thought, is ridiculous. You can accuse physicists of many stupidities, but being afraid of math isn’t one of them.

But the headline was supposedly based on scientific research. Someone, somewhere, had written a paper claiming that physicists are more likely to cite papers which are light on math. So, I put aside my confirmation bias and read the paper. It was more interesting than expected.

The paper in question, it turned out, didn’t show that physicists are afraid of math. Instead, it was a reply to a criticism of an earlier paper which had claimed that biologists are afraid of math.

The original paper, “Heavy use of equations impedes communication among biologists,” was published in 2012 by Tim Fawcett and Andrew Higginson, both at the Centre for Research in Animal Behaviour at the University of Exeter. They analyzed a sample of 649 papers published in the top journals in ecology and evolution and looked for a correlation between the density of equations (equations per text) and the number of citations. They found a statistically significant negative correlation: Papers with a higher density of equations were less cited.

Unexpectedly, a group of physicists came to the defense of biologists. In a paper published last year under the title “Are physicists afraid of mathematics?” Jonathan Kollmer, Thorsten Pöschel, and Jason Gallas set out to demonstrate that the statistics underlying the conclusion that biologists are afraid of math were fundamentally flawed. With these methods, the authors claimed, you could show anything, even that physicists are afraid of math. Which is surely absurd. Right? They argued that Fawcett and Higginson had arrived at a wrong conclusion because they had sorted their data into peculiar and seemingly arbitrarily chosen bins.

It’s a good point to make. The chance that you find a correlation with any one binning is much higher than the chance that you find it with one particular binning. Therefore, you can easily screw up measures of statistical significance if you allow a search for a correlation with different binnings.
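A minimal simulation makes this concrete. The sketch below (Python; the numbers and the toy analysis are mine, made up for illustration, not taken from either paper) draws completely uncorrelated fake “equation densities” and “citation counts” and checks how often at least one of several binnings of the same null data produces a nominally significant trend in the bin means:

import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(0)
n_papers, n_trials = 649, 2000          # sample size as in the ecology study; the data here are pure noise
bin_choices = [4, 5, 6, 8, 10, 12]      # different numbers of equation-density bins one could try
hits_fixed, hits_any = 0, 0

for _ in range(n_trials):
    eq_density = rng.exponential(1.0, n_papers)   # fake equations-per-text values, independent of ...
    citations = rng.poisson(10, n_papers)         # ... fake citation counts
    p_values = []
    for k in bin_choices:
        edges = np.quantile(eq_density, np.linspace(0, 1, k + 1))
        bins = np.clip(np.digitize(eq_density, edges[1:-1]), 0, k - 1)
        bin_means = [citations[bins == b].mean() for b in range(k)]
        _, p = pearsonr(range(k), bin_means)      # trend of mean citations across bins
        p_values.append(p)
    hits_fixed += p_values[0] < 0.05              # sticking to one pre-chosen binning
    hits_any += min(p_values) < 0.05              # freedom to report whichever binning "works"

print("false-positive rate, fixed binning:   ", hits_fixed / n_trials)
print("false-positive rate, best-of-binnings:", hits_any / n_trials)

The rate in the last line should come out above the nominal 5 percent while the pre-chosen binning stays near it – which is all the criticism amounts to.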

To make their point, Kollmer et al used a sample of papers from Physical Review Letters (PRL) and showed that, with the bins used by Fawcett and Higginson, physicists too could be said to be afraid of math. Alas, the correlation goes away with a finer binning and hence is meaningless.

PRL, for those not familiar with it, is one of the most highly ranked journals in physics generally. It publishes papers from all subfields that are of broad interest to the community. PRL also has a strictly enforced page limit: You have to squeeze everything on four pages – an imo completely idiotic policy that more often than not means the authors have to publish a longer, comprehensible, paper elsewhere.

The paper that now made headlines is a reply by the authors of the original study to the physicists who criticized it. Fawcett and Higginson explain that the physicists’ data analysis is too naïve. They point out that the citation rates have a pronounced rich-get-richer trend which amplifies any initial differences. This leads to an `overdispersed’ data set in which the standard errors are misleading. In that case, a more complicated statistical analysis is necessary, which is the type of analysis they had done in the original paper. The seemingly arbitrary bins were just chosen to visualize the results, they write, but their finding is independent of that choice.
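To see why overdispersion matters, here is another small sketch (Python; the numbers are invented, not taken from either paper). It compares the standard error one would naively assign to a mean citation count if the counts were Poisson with the actual spread when the counts are heavy-tailed and negative-binomially distributed – a crude stand-in for a rich-get-richer process:

import numpy as np

rng = np.random.default_rng(1)
n_papers = 649
mean_cites = 20.0
r = 0.5                                   # small dispersion parameter = strongly overdispersed counts
p = r / (r + mean_cites)                  # negative binomial with this mean; variance = mean + mean^2/r

sample_means = [rng.negative_binomial(r, p, n_papers).mean() for _ in range(5000)]

naive_se = np.sqrt(mean_cites / n_papers)   # standard error of the mean if the counts were Poisson
actual_se = np.std(sample_means)            # empirical spread of the mean under the overdispersed model

print("naive (Poisson) standard error:", round(naive_se, 3))
print("actual standard error:         ", round(float(actual_se), 3))

The naive error comes out several times too small, which is the sense in which “the standard errors are misleading.”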

Fawcett and Higginson then repeated the same analysis on the physics papers and revealed a clear trend: Physicists too are more likely to cite papers with a smaller density of equations!

I have to admit this doesn’t surprise me much. A paper with fewer verbal explanations per equation assumes the reader is more familiar with the particular formalism being used, and this means the target audience shrinks. The consequence is fewer citations.

But this doesn’t mean physicists are afraid of math, it merely means they have to decide which calculations are worth their time. If it’s a topic they might never have an application for, making their way through a paper heavy on math might not be so helpful for advancing their research. On the other hand, reading a more general introduction or short survey with fewer equations might be useful also on topics farther from one’s own research. These citation habits therefore show mostly that the more specialized a paper, the fewer people will read it.

I had a brief exchange with Andrew Higginson, one of the authors of the paper that’s been headlined as “Physicists are afraid of math.” He emphasizes that their point was that “busy scientists might not have time to digest lots of equations without accompanying text.” But I don’t think that’s the right conclusion to draw. Busy scientists who are familiar with the equations might not have the time to digest much text, and busy scientists might not have the time to digest long papers, period. (The corresponding author of the physicists’ study did not respond to my request for comment.)

In their recent reply, Fawcett and Higginson suggest that “an immediate, pragmatic solution to this apparent problem would be to reduce the density of equations and add explanatory text for non-specialised readers.”

I’m not sure, however, there is any problem here in need of being solved. Adding text for non-specialized readers might be cumbersome for the specialized readers. I understand the risk that the current practice exaggerates the already pronounced specialization, which can hinder communication. But this, I think, would be better taken care of by reviews and overview papers to be referenced in the, typically short, papers on recent research.

So, I don’t think physicists are afraid of math. Indeed, it sometimes worries me how much and how uncritically they love math.

Math can do a lot of things for you, but in the end it’s merely a device to derive consequences from assumptions. Physics isn’t math, however, and physics papers don’t work by theorems and proofs. Theoretical physicists pride themselves on their intuition and frequently take the freedom to shortcut mathematical proofs by drawing on experience. This, however, amounts to making additional assumptions, for example that a certain relation holds or an expansion is well-defined.

That works well as long as these assumptions are used to arrive at testable predictions. In that case it matters only if the theory works, and the mathematical rigor can well be left to mathematical physicists for clean-up, which is how things went historically.

But today in the foundations of physics, theory-development proceeds largely without experimental feedback. In such cases, keeping track of assumptions is crucial – otherwise it becomes impossible to tell what really follows from what. Or, I should say, it would be crucial because theoretical physicists are bad at this.

The result is that some research areas can amass loosely connected arguments that follow from a set of assumptions that aren’t written down anywhere. This might result in an entirely self-consistent construction and yet not have anything to do with reality. If the underlying assumptions aren’t written down anywhere, the result is conceptual mud in which case we can’t tell philosophy from mathematics.

One such unwritten assumption that is widely used, for example, is the absence of finetuning or that a physical theory be “natural.” This assumption isn’t supported by evidence and it can’t be mathematically derived. Hence, it should be treated as a hypothesis – but that isn’t happening because the assumption itself isn’t recognized for what it is.

Another unwritten assumption is that more fundamental theories should somehow be simpler. This is reflected for example in the belief that the gauge couplings of the standard model should meet in one point. That’s an assumption; it isn’t supported by evidence. And yet it’s not treated as a hypothesis but as a guide to theory-development.

And all presently existing research on the quantization of gravity rests on the assumption that quantum theory itself remains unmodified at short distance scales. This is another assumption that isn’t written down anywhere. Should that turn out to be not true, decades of research will have been useless.

In the absence of experimental guidance, what we need in the foundations of physics is conceptual clarity. We need rigorous math, not claims to experience, intuition, and aesthetic appeal. Don’t be afraid, but we need more math.

Friday, June 24, 2016

Wissenschaft auf Abwegen

On Monday I was in Regensburg, where I gave a public talk on the topic “Wissenschaft auf Abwegen” (science gone astray) for a series titled “Was ist Wirklich?” (what is real?). The whole thing is now on YouTube. The video consists of about 30 minutes of talk, followed by another hour of discussion. All in German. Only for true fans ;)

Monday, June 13, 2016

String phenomenology of the somewhat different kind

[Cat’s cradle. Image Source.]
Ten years ago, I didn’t take the “string wars” seriously. To begin with, referring to such an esoteric conflict as “war” seems disrespectful to millions caught in actual wars. In comparison to their suffering it’s hard to take anything seriously.

Leaving aside my discomfort with the nomenclature, the focus on string theory struck me as odd. String theory as a research area stands out in hep-th and gr-qc merely because of the large number of followers, not by the supposedly controversial research practices. For anybody working in the field it is apparent that string theorists don’t differ in their single-minded focus from physicists in other disciplines. Overspecialization is a common disease of academia, but one that necessarily goes along with division of labor, and often it is an efficient route to fast progress.

No, I thought back then, string theory wasn’t the disease, it was merely a symptom. The underlying disease was one that would surely soon be recognized and addressed: Theoreticians – as scientists whose most-used equipment is their own brain – must be careful to avoid systematic bias introduced by their apparatuses. In other words, scientific communities, and especially those which lack timely feedback by data, need guidelines to avoid social and cognitive biases.

This is so obvious it came as a surprise to me that, in 2006, everybody was hitting on Lee Smolin for pointing out what everybody knew anyway, that string theorists, lacking experimental feedback for decades, had drifted off in a math bubble with questionable relevance for the description of nature. It’s somewhat ironic that, from my personal experience, the situation is actually worse in Loop Quantum Gravity, an approach pioneered, among others, by Lee Smolin. At least the math used by string theorists seems to be good for something. The same cannot be said about LQG.

Ten years later, it is clear that I was wrong in thinking that just drawing attention to the problem would seed a solution. Not only has the situation not improved, it has worsened. We now have some theoretical physicists who argue that we should alter the scientific method so that the success of a theory can be assessed by means other than empirical evidence. This idea, which has sprung up in the philosophy community, isn’t all that bad in principle. In practice, however, it will merely serve to exacerbate social streamlining: If theorists can draw on criteria other than the ability of a theory to explain observations, the first criterion they’ll take into account is aesthetic value, and the second is popularity with their colleagues. Nothing good can come out of this.

And nothing good has come out of it, nothing has changed. The string wars clearly were more interesting for sociologists than they were for physicists. In the last couple of months several articles have appeared which comment on various aspects of this episode, which I’ve read and want to briefly summarize for you.

First, there is
    Collective Belief, Kuhn, and the String Theory Community
    Weatherall, James Owen and Gilbert, Margaret
    philsci-archive:11413
This paper is a very Smolin-centric discussion of whether string theorists are exceptional in their group beliefs. The authors argue that, no, actually string theorists just behave like normal humans and “these features seem unusual to Smolin not because they are actually unusual, but because he occupies an unusual position from which to observe them.” He is unusual, the authors explain, for having worked on string theory, but then deciding to not continue in the field.

It makes sense, the authors write, that people whose well-being to some extent depends on the acceptance by the group will adapt to the group:
“Expressing a contrary view – bucking the consensus – is an offense against the other members of the community… So, irrespective of their personal beliefs, there are pressures on individual scientists to speak in certain ways. Moreover, insofar as individuals are psychologically disposed to avoid cognitive dissonance, the obligation to speak in certain ways can affect one’s personal beliefs so as to bring them into line with the consensus, further suppressing dissent from within the group.”
Furthermore:
“As parties to a joint commitment, members of the string theory community are obligated to act as mouthpieces of their collective belief.”
I actually thought we have known this since 1895, when Le Bon published his “Study of the Popular Mind.”

The authors of the paper then point out that it’s normal for members of a scientific community to not jump ship at the slightest indication of conflicting evidence because often such evidence turns out to be misleading. It didn’t become clear to me what evidence they might be referring to; supposedly it’s non-empirical.

They further argue that a certain disregard for what is happening outside one’s own research area is also normal: “Science is successful in part because of a distinctive kind of focused, collaborative research,” and due to their commitment to the agenda “participants can be expected to resist change with respect to the framework of collective beliefs.”

This is all reasonable enough. Unfortunately, the authors entirely miss the main point, the very reason for the whole debate. The question isn’t whether string theorists’ behavior is that of normal humans – I don’t think that was ever in doubt – but whether that “normal human behavior” is beneficial for science. Scientific research requires, in a very specific sense, non-human behavior. It’s not normal for individuals to disregard subjective assessments and to not pay attention to social pressure. And yet, that is exactly what good science would require.

The second paper is basically a summary of the string wars that focuses on the question whether or not string theory can be considered science. This “demarcation problem” is a topic that philosophers and sociologists love to discuss, but to me it really isn’t particularly interesting how you classify some research area; to me the question is whether it’s good for something. This is a question which should be decided by the community, but as long as decision making is influenced by social pressures and cognitive biases I can’t trust the community judgement.

The article has a lot of fun quotations from very convinced string theorists, for example by David Gross: “String theory is full of qualitative predictions, such as the production of black holes at the LHC.” I’m not sure what the difference is between a qualitative prediction and no prediction, but either way it’s certainly not a prediction that was very successful. Also nice is John Schwarz claiming that “supersymmetry is the major prediction of string theory that could appear at accessible energies” and that “some of these superpartners should be observable at the LHC.” Lots of coulds and shoulds that didn’t quite pan out.

While the article gives a good overview on the opinions about string theory that were voiced during the 2006 controversy, the authors themselves clearly don’t know very well the topic they are writing about. A particularly odd statement that highlights their skewed perspective is: “String theory currently enjoys a privileged status by virtue of being the dominant paradigm within theoretical physics.”

I find it quite annoying how frequently I encounter this extrapolation from a particular research area – may that be string theory, supersymmetry, or multiverse cosmology – to all of physics. The vast majority of physicists work in fields like quantum optics, photonics, hadronic and nuclear physics, statistical mechanics, atomic physics, solid state physics, low-temperature physics, plasma physics, astrophysics, condensed matter physics, and so on. They have nothing whatsoever to do with string theory, and certainly would be very surprised to hear that it’s “the dominant paradigm.”

In any case, you might find this paper useful if you didn’t follow the discussion 10 years ago.

Finally, there is a third paper. Its title doesn’t explicitly refer to string theory, but most of it is also a discussion of the demarcation problem, using the example of arXiv trackbacks. (I suspect this paper is a spin-off of the previous one.)

ArXiv trackbacks, in case you didn’t know, are links to blogposts that show up on some papers’ arXiv pages when the blogpost has referred to the paper. Exactly which blogs get trackbacks, and who makes the decision whether they do, is one of the arXiv’s best-kept secrets. Peter Woit’s blog, infamously, doesn’t show up in the arXiv trackbacks, for the rather spurious reason that he supposedly doesn’t count as an “active researcher.” The paper tells the full 2006 story with lots of quotes from bloggers you are probably familiar with.

The arXiv recently conducted a user survey, among other things about the trackback feature, which makes me think they might have some updates planned.

On the question who counts as crackpot, the paper (unsurprisingly) doesn’t come to a conclusion other than noting that scientists deal with the issue by stating “we know one when we see one.” I don’t think there can be any other definition than that. To me the notion of “crackpot” is an excellent example of an emergent feature – it’s a demarcation that the community creates during its operation. Any attempt to come up with a definition from first principles is hence doomed to fail.

The rest of the paper is a general discussion of the role of blogs in science communication, but I didn’t find it particularly insightful. The author comes to the (correct) conclusion that blog content turned out not to have such a short life-time as many feared, but otherwise basically just notes that there are as many ways to use blogs as there are bloggers. But then if you are reading this, you already knew that.

One of the main benefits that I see in blogs isn’t mentioned in the paper at all, which is that blogs support communication between scientific communities that are only loosely connected. In my own research area, I read the papers, hear the seminars, and go to conferences, and I therefore know pretty well what is going on – with or without blogs. But I use blogs to keep up to date in adjacent fields, like cosmology, astrophysics and, to a lesser extent, condensed matter physics and quantum optics. For this purpose I find blogs considerably more useful than popular science news, because the latter often doesn’t provide a useful amount of detail and commentary, not to mention that they all tend to latch onto the same three papers that made big unsubstantiated claims.

Don’t worry, I haven’t suddenly become obsessed with string theory. I’ve read through these sociology papers mainly because I cannot not write a few paragraphs about the topic in my book. But I promise that’s it from me about string theory for some while.

Update: Peter Woit has some comments on the trackback issue.

Thursday, May 19, 2016

The Holy Grail of Crackpot Filtering: How the arXiv decides what’s science – and what’s not.

Where do we draw the boundary between science and pseudoscience? It’s a question philosophers have debated for as long as there’s been science – and last time I looked they hadn’t made much progress. When you ask a sociologist, the answer is normally a variant of: Science is what scientists do. So what do scientists do?

You might have heard that scientists use what’s called the scientific method, a virtuous cycle of generating and testing hypotheses which supposedly separates the good ideas from the bad ones. But that’s only part of the story because it doesn’t tell you where the hypotheses come from to begin with.

Science doesn’t operate with randomly generated hypotheses for the same reason natural selection doesn’t work with randomly generated genetic codes: it would be highly inefficient and any attempt to optimize the outcome would be doomed to fail. What we do instead is heavily filtering hypotheses, and then we consider only those which are small mutations of ideas that have previously worked. Scientists like to be surprised, but not too much.

Indeed, if you look at the scientific enterprise today, almost all of its institutionalized procedures are methods not for testing hypotheses, but for filtering hypotheses: Degrees, peer reviews, scientific guidelines, reproduction studies, measures for statistical significance, and community quality standards. Even the use of personal recommendations works to that end. In theoretical physics in particular the prevailing quality standard is that theories need to be formulated in mathematical terms. All these are requirements which have evolved over the last two centuries – and they have proved to work very well. It’s only smart to use them.

But the business of hypotheses filtering is a tricky one and it doesn’t proceed by written rules. It is a method that has developed through social demarcation, and as such it has its pitfalls. Humans are prone to social biases and every once in a while an idea gets dismissed not because it’s bad, but because it lacks community support. And there is no telling how often this happens because these are the stories we never get to hear.

It isn’t news that scientists lock shoulders to defend their territory and use technical terms like fraternities use secret handshakes. It thus shouldn’t come as a surprise that an electronic archive which caters to the scientific community would develop software to emulate the community’s filters. And that is, in a nutshell, basically what the arXiv is doing.

In an interesting recent paper, Luis Reyes-Galindo had a look at the arXiv moderators and their reliance on automated filters.


In the attempt to develop an algorithm that would sort papers into arXiv categories automatically, thereby helping arXiv moderators decide when a submission needs to be reclassified, it turned out that papers which scientists would mark down as “crackpottery” showed up as not classifiable or stood out by language significantly different from that in the published literature. According to Paul Ginsparg, who developed the arXiv more than 20 years ago:
“The first thing I noticed was that every once in a while the classifier would spit something out as ‘I don't know what category this is’ and you’d look at it and it would be what we’re calling this fringe stuff. That quite surprised me. How can this classifier that was tuned to figure out category be seemingly detecting quality?

“[Outliers] also show up in the stop word distribution, even if the stop words are just catching the style and not the content! They’re writing in a style which is deviating, in a way. [...]

“What it’s saying is that people who go through a certain training and who read these articles and who write these articles learn to write in a very specific language. This language, this mode of writing and the frequency with which they use terms and in conjunctions and all of the rest is very characteristic to people who have a certain training. The people from outside that community are just not emulating that. They don’t come from the same training and so this thing shows up in ways you wouldn’t necessarily guess. They’re combining two willy-nilly subjects from different fields and so that gets spit out.”
It doesn’t surprise me much – you can see this happening in comment sections all over the place: The “insiders” can immediately tell who is an “outsider.” Often it doesn’t take more than a sentence or two, an odd expression, a term used in the wrong context, a phrase that nobody in the field would ever use. It is only consequential that with smart software you can tell insiders from outsiders even more efficiently than humans. According to Ginsparg:
“We've actually had submissions to arXiv that are not spotted by the moderators but are spotted by the automated programme [...] All I was trying to do is build a simple text classifier and inadvertently I built what I call The Holy Grail of Crackpot Filtering.”
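Here is a toy version of the general pattern being described, just to make it tangible (Python; the mini-corpus is obviously made up, and this has nothing to do with arXiv’s actual classifier): build a vocabulary from texts written by insiders, then flag submissions whose wording deviates too much from it.

# Tiny illustration: texts whose vocabulary deviates strongly from a reference
# corpus get flagged. Real classifiers use much richer features, but the idea
# that "can't place it" doubles as an outlier signal is the same.

corpus = [
    "we compute the one loop effective action for the gauge field background",
    "the black hole horizon entropy is derived from the euclidean partition function",
    "we constrain the dark energy equation of state with supernova data",
    "galaxy cluster weak lensing measurements test modified gravity models",
]
known_words = {word for doc in corpus for word in doc.split()}

def out_of_vocabulary_fraction(text):
    words = text.lower().split()
    return sum(word not in known_words for word in words) / len(words)

submissions = [
    "we test modified gravity models with galaxy cluster lensing data",
    "einstein was wrong because time is a living vortex of pure energy",
]
for text in submissions:
    oov = out_of_vocabulary_fraction(text)
    verdict = "flag for human moderation" if oov > 0.5 else "reads like the corpus"
    print(f"{oov:.2f}  {verdict}  |  {text}")

With a big enough corpus and smarter features this is, in spirit, how a category classifier ends up moonlighting as a crackpot filter.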
Trying to speak in the code of a group you haven’t been part of at least for some time is pretty much impossible, much like it’s impossible to fake the accent of a city you haven’t lived in for some while. Such in-group and out-group demarcation is subject of much study in sociology, not specifically the sociology of science, but generally. Scientists are human and of course in-group and out-group behavior also shapes their profession, even though they like to deny it as if they were superhuman think-machines.

What is interesting about this paper is that, for the first time, it openly discusses how the process of filtering happens. It’s software that literally encodes the hidden rules that physicists use to sort out cranks. From what I can tell, the arXiv filters work reasonably well, otherwise there would be much complaint in the community. But the vast majority of researchers in the field are quite satisfied with what the arXiv is doing, meaning the arXiv filters match their own judgement.

There are exceptions of course. I have heard some stories of people who were working on new approaches that fell between the stools and were flagged as potential crackpottery. The cases that I know of could eventually be resolved, but that might tell you more about the people I know than about the way such issues typically end.

Personally, I have never had a problem with the arXiv moderation. I had a paper reclassified from gen-ph to gr-qc once by a well-meaning moderator, which is how I learned that gen-ph is the dump for borderline crackpottery. (How would I have known? I don’t read gen-ph. I was just assuming someone reads it.)

I don’t so much have an issue with what gets filtered on the arXiv, what bothers me much more is what does not get filtered and hence, implicitly, gets approval by the community. I am very sympathetic to the concerns of John The-End-Of-Science Horgan that scientists don’t clean enough on their own doorsteps. There is no “invisible hand” that corrects scientists if they go astray. We have to do this ourselves. In-group behavior can greatly misdirect science because, given sufficiently many people, even fruitless research can become self-supportive. No filter that is derived from the community’s own judgement will do anything about this.

It’s about time that scientists start paying attention to social behavior in their community. It can, and sometimes does, affect objective judgement. Ignoring or flagging what doesn’t fit into pre-existing categories is one such social problem that can stand in the way of progress.

In a 2013 paper published in Science, a group of researchers quantified the likelihood of combinations of topics in citation lists and studied the cross-correlation with the probability of the paper becoming a “hit” (meaning in the upper 5th percentile of citation scores). They found that having previously unlikely combinations in the quoted literature is positively correlated with the later impact of a paper. They also note that the fraction of papers with such ‘unconventional’ combinations has decreased from 3.54% in the 1980s to 2.67% in the 1990s, “indicating a persistent and prominent tendency for high conventionality.”

Conventional science isn’t bad science. But we also need unconventional science, and we should be careful to not assign the label “crackpottery” too quickly. If science is what scientists do, scientists should pay some attention to the science of what they do.

Monday, February 15, 2016

What makes an idea worthy? An interview with Anthony Aguirre

That science works merely by testing hypotheses has never been less true than today. As data have become more precise and theories have become more successful, scientists have become increasingly careful in selecting hypotheses before even putting them to test. Commissioning an experiment for every odd idea would be an utter waste of time, not to mention money. But what makes an idea worthy?

Pre-selection of hypotheses is especially important in fields where internal consistency and agreement with existing data are very strong constraints already, and it therefore plays an essential role in the foundations of physics. In this area, most new hypotheses are born dead or die very quickly, and researchers would rather not waste time devising experimental tests for ill-fated non-starters. During their career, physicists must thus constantly decide whether a new idea justifies spending years of research on it. Next to personal interest, their decision criteria are often based on experience and community norms – past-oriented guidelines that reinforce academic inertia.

Philosopher Richard Dawid coined the word “post-empirical assessment” for the practice of hypotheses pre-selection, and described it as a non-disclosed Bayesian probability estimate. But philosophy is one thing, doing research another thing. For the practicing scientist, the relevant question is whether a disclosed and organized pre-selection could help advance research. This would require the assessment to be performed in a cleaner way than is presently the case, a way that is less prone to error induced by social and cognitive biases.

One way to achieve this could be to give researchers incentives for avoiding such biases. Monetary incentives are a possibility, but to convince a scientist that their best course of action is putting aside the need to promote their own research would mean incentives totaling research grants for several years – an amount that adverts on nerd pages won’t raise, and thus an idea that seems to be one of those ill-fated non-starters. But then for most scientists their reputation is more important than money.

Anthony Aguirre.
Image Credits: Kelly Castro.
And so Anthony Aguirre, Professor of Physics at UC Santa Cruz, devised an algorithm by which scientists can estimate the chances that an idea succeeds, and gain reputation by making accurate predictions. On his website Metaculus, users are asked to evaluate the likelihood of success for various scientific and technological developments. In the email exchange below, Anthony explains his idea.

Bee: Last time I heard from you, you were looking for bubble collisions as evidence of the multiverse. Now you want physicists to help you evaluate the expected impact of high-risk/high-reward research. What happened?

Anthony: Actually, I’ve been thinking about high-risk/high-reward research for longer than bubble collisions! The Foundational Questions Institute (FQXi) is now in its tenth year, and from the beginning we’ve seen part of FQXi’s mission as helping to support the high-risk/high-reward part of the research funding spectrum, which is not that well-served by the national funding agencies. So it’s a long-standing question how to best evaluate exactly how high-risk and high-reward a given proposal is.

Bubble collisions are actually a useful example of this. It’s clear that seeing evidence of an eternal-inflation multiverse would be pretty huge news, and of deep scientific interest. But even if eternal inflation is right, there are different versions of it, some of which have bubbles and some of which don’t; and even of those that do, only some subset will yield observable bubble collisions. So: how much effort should be put into looking for them? A few years of grad student or postdoc time? In my opinion, yes. A dedicated satellite mission? No way, unless there were some other evidence to go on.

(Another lesson, here, in my opinion, is that if one were to simply accept the dismissive “the multiverse is inherently unobservable” critique, one would never work out that bubble collisions might be observable in the first place.)

B: What is your relation to FQXi?

A: Max Tegmark and I started FQXi in 2006, and have had a lot of fun (and only a bit of suffering!) trying to build something maximally useful to the community of people thinking about the type of foundational, big-picture questions we like to think about.

B: What problem do you want to address with Metaculus?

A: Predicting and evaluating (should “prevaluating” be a word?) science research impact was actually — for me — the second motivation for Metaculus. The first grew out of another nonprofit I helped found, the Future of Life Institute (FLI). A core question there is how major new technologies like AI, genetic engineering, nanotech, etc., are likely to unfold. That’s a hard thing to know, but not impossible to make interesting and useful forecasts for.

FLI and organizations like it could try to build up a forecasting capability by hiring a bunch of researchers to do that. But I wanted to try something different: to generate a platform for soliciting and aggregating predictions that — with enough participation and data generation — could make accurate and well-calibrated predictions about future technology emergence as well as a whole bunch of other things.

As this idea developed, my collaborators (including Greg Laughlin at UCSC) and I realized that it might also be useful in filling a hole in our community’s ability to predict the impact of research. This could in principle help make better decisions about questions ranging from the daily (“Which of these 40 papers in my ‘to read’ folder should I actually carefully read?”) to the large-scale (“Should we fund this $2M experiment on quantum cognition?”).

B: How does Metaculus work?

A: The basic structure is a set of (currently) binary questions about the occurrence of future events, ranging from predictions about technologies like self-driving cars, Go-playing AIs and nuclear fusion, to pure science questions such as the detection of Planet 9, publication of experiments in quantum cognition or tabletop quantum gravity, or announcement of the detection of gravitational waves.

Participants are invited to assess the likelihood (1%-99%) of those events occurring. When a given question ‘resolves’ as either true or false, points are awarded depending on a user's prediction, the community’s prediction, and what actually happened. These points add a competitive game aspect, but serve the more important purpose of providing steady feedback, so that predictors can learn how to predict more accurately and with better calibration. As data accumulate, predictors will also amass a track record, both overall and in particular subjects. This can be used to aggregate predictions into a single, more accurate one (at the moment, the ‘community’ prediction is just a straight median).
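Since the interview only pins down that the community prediction is currently a straight median, here is a minimal sketch of that aggregation plus a standard proper scoring rule as a stand-in for the actual (more involved, and not spelled out here) Metaculus point formula; the predictor names and numbers are of course made up:

import numpy as np

# Hypothetical predictors and their probabilities for one binary question.
predictions = {"alice": 0.80, "bob": 0.55, "carol": 0.95, "dave": 0.30}
outcome = True                         # the question resolved positively

# Community prediction: straight median, as described in the interview.
community = float(np.median(list(predictions.values())))
print(f"community prediction: {community:.2f}")

def log_score(p, happened):
    """Bits gained over a maximally uncertain 50% prediction (positive = better)."""
    p_outcome = p if happened else 1.0 - p
    return np.log2(p_outcome) - np.log2(0.5)

for name, p in predictions.items():
    print(f"{name:>6}: predicted {p:.2f}, score {log_score(p, outcome):+.2f}")

A proper scoring rule like this rewards honest, well-calibrated probabilities, which is the kind of feedback Anthony describes.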

An important aspect of this, I think, is not ‘just’ to make better predictions about well-known questions, but to create lots and lots of well-posed questions. It really does make you think about things differently when you have to come up with a well-posed question that has a clear criterion for resolution. And there are lots of questions where even a few predictions (even one!) by the right people can be a very useful resource. So a real utility is for this to be a sort of central clearing-house for predictions.

B: What is the best possible outcome that you can imagine from this website and what does it take to get there?

A: The best outcome I could imagine would be this becoming really large-scale and useful, like a Wikipedia or Quora for predictions. It would also be a venue in which the credibility to make pronouncements about the future would actually be based on one’s actual demonstrated ability to make good predictions. There is, sadly, nothing like that in our current public discourse, and we could really use it.

I’d also be happy (if not as happy) to see Metaculus find a more narrow but deep niche, for example in predicting just scientific research/experiment success, or just high-impact technological rollouts (such as AI or Biotech).

In either case, it will take continued steady growth of both the community of users and the website’s capabilities. We already have all sorts of plans for multi-outcome questions, contingent questions, Bayes nets, algorithms for matching questions to predictors, etc. — but that will take time. We also need feedback about what users like, and what they would like the system to be able to do. So please try it out, spread the word, and let us know what you think!

Thursday, January 28, 2016

Does the arXiv censor submissions?

The arXiv is the physicists' marketplace of ideas. In high energy physics and adjacent fields, almost all papers are submitted to the arXiv prior to journal submission. Developed by Paul Ginsparg in the early 1990s, this open-access pre-print repository has served the physics community for more than 20 years, and meanwhile also extends to adjacent fields like mathematics, economics, and biology. It fulfills an extremely important function by helping us to exchange ideas quickly and efficiently.

Over the years the originally free signup became more restricted. If you sign up for the arXiv now, you need to be "endorsed" by several people who are already signed up. It also became necessary to screen submissions to keep the quality level up. In hindsight, this isn't surprising: more people means more trouble. And sometimes, of course, things go wrong.

I have heard various stories about arXiv moderation gone wrong, mostly these are from students, and mostly it affects those who work in small research areas or those whose name is Garrett Lisi.

A few days ago, a story appeared online which quickly spread. Nicolas Gisin, an established Professor of Physics who works on quantum cryptography (among other things), relates the story of two of his students who ventured into territory unfamiliar to him: black hole physics. They wrote a paper that appeared to him likely wrong but reasonable. It got rejected by the arXiv. The paper later got published by PLA (a respected journal that, however, does not focus on general relativity). More worrisome still, the students' next paper also got rejected by the arXiv, making it appear as if they were now blacklisted.

Now the paper that caused the offense is, haha, not on the arXiv, but I tracked it down. So let me just say that I think it's indeed wrong and it shouldn't have gotten published in a journal. They are basically trying to include the backreaction of the outgoing Hawking-radiation on the black hole. It's a thorny problem (the very problem this blog was named after) and the treatment in the paper doesn't make sense.

Hawking radiation is not produced at the black hole horizon. No, it is not. And tracking the flux back from infinity to the horizon is therefore not correct. Besides this, the equation for the mass-loss that they use is a late-time approximation in a collapse situation. One can't use this approximation for a metric without collapse, and it certainly shouldn't be used down to the Planck mass. If you have a collapse-scenario, to get the backreaction right you would have to calculate the emission rate prior to horizon formation, time-dependently, and integrate over this.
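For reference, the kind of late-time law in question is the usual one (proportionality constant suppressed):

$$\frac{dM}{dt} \simeq -\frac{C}{M^2} \quad\Rightarrow\quad M(t) \simeq \left(M_0^3 - 3\,C\,t\right)^{1/3}\,,$$

and it holds for a black hole that formed in collapse, well after horizon formation, and only as long as the mass stays far above the Planck mass.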

Ok, so the paper is wrong. But should it have been rejected by the arXiv? I don't think so. The arXiv moderation can't and shouldn't replace peer review; it should just be a basic quality check, and the paper looks like a reasonable research project.

I asked a colleague who I know works as an arXiv moderator for comment. (S)he wants to stay anonymous but offers the following explanation:


I had not heard of the complaints/blog article, thanks for passing that information on...  
 The version of the article I saw was extremely naive and was very confused regarding coordinates and horizons in GR... I thought it was not “referee-able quality” — at least not in any competently run GR journal... (The hep-th moderator independently raised concerns...)  
 While it is now published at Physics Letters A, it is perhaps worth noting that the editorial board of Physics Letters A does *not* include anyone specializing in GR.
(S)he is correct of course. We haven't seen the paper that was originally submitted. It was very likely in considerably worse shape than the published version. Indeed, Gisin writes in his post that the paper was significantly revised during peer review. Taking this into account, the decision seems understandable to me.

The main problem I have with this episode is not that a paper got rejected which maybe shouldn't have been rejected -- because shit happens. Humans make mistakes, and let us be clear that the arXiv, underfunded as it is, relies on volunteers for the moderation. No, the main problem I have is the lack of transparency.

The arXiv is an essential resource for the physics community. We all put trust in a group of mostly anonymous moderators who do a rather thankless and yet vital job. I don't think the origin of the problem is with these people. I am sure they do the best they can. No, I think the origin of the problem is the lack of financial resources, which limits the ability to employ administrative staff to oversee the operations. You get what you pay for.

I hope that this episode will be a wake-up call to the community to put their financial support behind the arXiv, and to the arXiv to use this support to put into place a more transparent and better organized moderation procedure.

Note added: It was mentioned to me that the problem with the paper might be more elementary, in that they're using the wrong coordinates to begin with; it hadn't even occurred to me to check this. To tell you the truth, I am not really interested in figuring out exactly why the paper is wrong; it's beside the point. I just hope that whoever reviewed the paper for PLA now goes and sits in the corner for an hour with a paper bag over their head.

Tuesday, October 21, 2014

We talk too much.

Image Source: Loom Love.

If I had one word to explain human culture at the dawn of the 21st century it would be “viral”. Everybody, it seems, is either afraid of or trying to make something go viral. And as mother of two toddlers in Kindergarten, I am of course well qualified to comment on the issue of spreading diseases, like pinkeye, lice, goat memes, black hole firewalls, and other social infections.

Today’s disease is called rainbow loom. It spreads via wrist bands that you are supposed to crochet together from rubber rings. Our daughters are too young to crochet, but that doesn’t prevent them from dragging around piles of tiny rubber bands which they put on their fingers, toes, clothes, toys, bed posts, door knobs and pretty much everything else. I spend a significant amount of my waking hours picking up these rubber bands. The other day I found some in the cereal box. Sooner or later, we’ll accidentally eat one.

But most of the infections the kids bring home are words and ideas. As of recently, they call me “little fart” or “old witch” and, leaving aside the possibility that this is my husband’s vocabulary when I am away, they probably trade these expressions at Kindergarten. I’ll give you two witches for one fart, deal? Lara, amusingly enough, sometimes confuses the words “ass” and “men” – “Arch” and “Mench” in German with her toddler’s lisp. You’re not supposed to laugh, you’re supposed to correct them. It’s “Arsch,” Lara, “SCH, not CH, Arsch.”

Man, as Aristotle put it, is a zoon politikon: she lives in communities, she is social, she shares, she spreads ideas and viruses. He does too. I pass through Frankfurt international airport on average once per week. Research shows that the more often you are exposed to a topic, the more important you think it is, regardless of what the source is. It’s the repeated exposure that does it. Once you have a word in your head marked as relevant, your brain keeps pushing it around and hands it back to you to look for further information. Have I said Ebola yet?

Yes, words and ideas, news and memes, go viral, spread, mutate, and affect the way we think. And the more connected we are, the more we share, the more we become alike. We see the same things and talk about the same things. Because if you don’t talk about what everybody else talks about, would you even listen to yourself?

Not so surprisingly then, it has become fashionable to declare the end of individualism in science too, pointing towards larger and larger collaborations, increasing co-author networks, the need to share, and the success of sharing. According to this NYT headline, the “ERA OF BIG SCIENCE DIMINISHES ROLE OF LONELY GENIUS”. We can read there
“Born out of the complexity of modern technology, the era of the vast, big-budget research team came into its own with its scientific achievements of 1984.”
Yes, that’s right, this headline dates back 30 years.

The lonely genius of course has always been a myth. Science is and has always been a community enterprise. We’re standing on the shoulders of giants. Most of them are dead, ok, but we’re still standing, standing on these dead people’s shoulders, and we’re still talking and talking and talking. We’re all talking way too much. It’s hard not to have this impression after attending 5 conferences more or less in a row.

Collaboration is very en vogue today, or “trending” as we now say. Nature recently had an article about the measurement of the gravitational constant, G. Not a topic I care deeply about, but the article has an interesting quote:
“Until now, scientists measuring G have competed; everyone necessarily believes in their own value,” says Stephan Schlamminger, an experimental physicist at NIST. “A lot of these people have pretty big egos, so it may be difficult,” he says. “I think when people agree which experiment to do, everyone wants their idea put forward. But in the end it will be a compromise, and we are all adults so we can probably agree.” 
Working together could even be a stress reliever, says Jens Gundlach, an experimental physicist at the University of Washington in Seattle. Getting a result that differs from the literature is very uncomfortable, he says. “You think day and night, ‘Did I do everything right?’”
And here I was thinking that worrying day and night about whether you did everything right is the essence of science. But apparently that’s too much stress. It’s clearly better we all work together to make this stressful thinking somebody else’s problem. Can you have a look at my notes and find that missing sign?

The Chinese, as you have almost certainly read, are about to overtake the world, and in that effort they are now reforming their science research system. Nature magazine informs us that the idea of this reform is “to encourage scientists to collaborate on fewer, large problems, rather than to churn out marginal advances in disparate projects that can be used to seek multiple grants. “Teamwork is the key word,” says Mu-Ming Poo, director of the CAS Institute of Neuroscience in Shanghai.” Essentially, it seems, they’re giving out salary increases for scientists to think the same as their colleagues.

I’m a miserable cook. My mode of operation is taking whatever is in the fridge, throwing it into a pan with loads of butter, making sure it’s really dead, and then pouring salt over it. (So you don’t notice the rubber bands.) Yes, I’m a miserable cook. But I know one thing about cooking: if you cook it for too long or stir too much, all you get is mush. It’s the same with ideas. We’re better off with various individual approaches than one collaborative one. Too much systemic risk in putting all your eggs in the same journal.

The kids, they also bring home sand-bathed gummy bears that I am supposed to wash, their friend’s socks, and stacks of millimeter paper glued together because GLUE! Apparently some store donated cubic meters of this paper to the Kindergarten because nobody buys it anymore. I recall having to draw my error bars on this paper, always trying not to use an eraser because the grid would rub away with the pencil. Those were the days.

We speak about ideas going viral, but we never speak about what happens after this. We get immune. The first time I heard about the Stückelberg mechanism I thought it was the greatest thing ever. Now it’s on the daily increasing list of oh-yeah-this-thing. I’ve always liked the myth of the lonely genius. I have a new office mate. She is very quiet.

Tuesday, August 12, 2014

Do we write too many papers?

Every Tuesday, when the weekend submissions appear on the arXiv, I think we’re all writing too many papers. Not to mention that we work too often on weekends. Every Friday, when another week has passed in which nobody solved my problems for me, I think we’re not writing enough papers.

The Guardian recently published an essay by Timo Hannay, titled “Stop the deluge of science research”, though the URL suggests the original title was “Why we should publish less Scientific Research.” Hannay argues that the literature has become unmanageable and that we need better tools to structure and filter it, so that researchers can find what they are looking for. I.e., he doesn’t actually say we should publish less. Of course we all want better boats to stay afloat on the information ocean, but there are other aspects to the question of whether we publish too many papers that Hannay didn’t touch upon.

Here, I use “too much” to mean that the number of papers hinders scientific progress and no longer benefits it. The precise number depends very much on the field and its scientific culture and doesn’t matter all that much. Below I’ve collected some arguments that speak for or against the “too much papers” hypothesis.

Yes, we publish too many papers!
  • Too much to read, even with the best filter. The world doesn’t need to know about all these incremental steps, most of which never lead anywhere anyway.
  • Wastes the time of scientists who could be doing research instead. Publishing several short papers instead of one long one adds the time necessary to write several introductions and conclusions, adapt the paper to different journal styles, and fight with various sets of referees, just to then submit the paper to another journal and start all over again.
  • Just not reading them isn’t an option because one needs to know what’s going on. That creates a lot of headache, especially for newcomers. Better only publish what’s really essential knowledge.
  • Wastes the time of editors and referees. Editors and referees typically don’t have access to reports on manuscripts that follow-up works are based on.
No, we don’t publish too many papers!
  • If you think it’s too much, then just don’t read it.
  • If you think it’s too much, you’re doing it wrong. It’s all a matter of tagging, keywords, and search tools.
  • It’s good to know what everybody is doing and to always be up to date.
  • Journals make money with publishing our papers, so don’t worry about wasting their time.
  • Who really wants to write a referee report for one of these 30 pages manuscripts anyway?
Possible reasons that push researchers to publish more than is good for progress:
  • Results pressure. Scientists need published papers to demonstrate outcome of research they received grants for.
  • CV boosting. Lots of papers look like lots of ideas, at least if one doesn’t look too closely. (Especially young postdocs often believe they don’t have enough papers, so let me add a word of caution: Having too many papers can also work against you, because it creates the appearance that your work is superficial. Aim at quality, not quantity.)
  • Scooping angst. In fields which are overpopulated, like for example hep-th, researchers publish anything that might go through just to have a time-stamp that documents they were first.
  • Culture. Researchers adapt the publishing norms of their peers and want to live up to their expectations. (That however might also have the result that they publish less than is good for progress, depending on the prevailing culture of the field.)  
  • PhD production machinery. It’s becoming the norm, at least in physics, that PhD students already have several publications, typically with their PhD supervisor. Much of this is to make it easier for the students to find a good postdoc position, which again reflects positively on the supervisor. This all makes the hamster wheel turn faster and faster.
Altogether I don’t have a strong opinion on whether we’re publishing too much or not. What I do find worrisome though is that all these measures for scientific success reduce our tolerance for individuality. Some people write a lot, some less so. Some pay a lot of attention to detail, some rely more on intuition. Some like to discuss and get feedback early to sort out their thoughts, some like to keep their thoughts private until they’ve sorted them out themselves. I think everybody should do their research the way it suits them best, but unfortunately we’re all increasingly forced to publish at rates close to the field average. And who said that the average is the ideal?

Monday, March 17, 2014

Do scientists deliberately use technical expressions so they cannot be understood?

Secret handshake?
Science or gibberish?
“[E]xisting pseudorandom and introspective approaches use pervasive algorithms to create compact symmetries. The development of interrupts would greatly amplify Byzantine fault tolerance. We construct a novel method for the investigation of online algorithms.”

“[T]he effective diminution of the relevant degrees of freedom in the ultraviolet (on which morally speaking all approaches agree) is interpreted as universality in the statistical physics sense in the vicinity of an ultraviolet renormalization group fixed point. The resulting picture of microscopic geometry is fractal-like with a local dimensionality of two.”
IEEE and Springer recently withdrew 120 papers that turned out to be randomly generated nonsense, and Schadenfreude spread among the critics of commercial academic publishing. The internet offers a wide variety of random text generators, including the one used to create the now-withdrawn Springer papers, called SciGen. The difficult part of creating random academic text is the grammar, not the vocabulary. If you start with a grammatically correct sentence it is easy enough to fill in technical language.

Take as example the above sentence
“The difficult part of creating random text is the grammar, not the vocabulary.”
And just replace some nouns and adverbs:
“The difficult part of creating completely antisymmetric turbulence is the higher order correction, not the parametric resonance.”
Or maybe
“The difficult part of creating parametric turbulence is the completely antisymmetric resonance, not the higher order correction.”
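It really takes almost no machinery. Here is a minimal sketch in Python (the template and the hep-th-flavored word lists are made up for illustration; the actual SciGen grammar is far more elaborate):

import random

# A grammatically fixed template; only the vocabulary gets swapped in.
TEMPLATE = ("The difficult part of creating {adj1} {noun1} "
            "is the {adj2} {noun2}, not the {adj3} {noun3}.")

# Made-up stock of technical-sounding words (illustrative only).
ADJECTIVES = ["completely antisymmetric", "parametric", "higher order",
              "non-perturbative", "renormalizable"]
NOUNS = ["turbulence", "resonance", "correction",
         "fixed point", "gauge anomaly"]

def fake_sentence():
    """Fill the grammatical template with randomly chosen jargon."""
    adjs = random.sample(ADJECTIVES, 3)
    nouns = random.sample(NOUNS, 3)
    return TEMPLATE.format(adj1=adjs[0], noun1=nouns[0],
                           adj2=adjs[1], noun2=nouns[1],
                           adj3=adjs[2], noun3=nouns[2])

for _ in range(3):
    print(fake_sentence())

Run it a few times and the output reads just like the two variants above.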
Sounds very educated, yes? I have some practice with that ;o)

The problem is that if you don’t know the technical terms you can’t tell whether the relations implied by the grammar make sense. There is thus, not so surprisingly, a long history of cynics abusing this narrow target group of academic writing, and this cynicism spreads rapidly now that academic writing has become more widely available. With the open access movement there swells the background choir chanting that availability isn’t the same as accessibility. Nicholas Kristof recently complained about academic writing in an NYT op-ed:
“[A]cademics seeking tenure must encode their insights into turgid prose. As a double protection against public consumption, this gobbledygook is then sometimes hidden in obscure journals — or published by university presses whose reputations for soporifics keep readers at a distance.”
Kristof calls upon academics to better communicate with the public, which I certainly support. At the same time, however, he also claims that professional language is unnecessary and deliberately exclusive:
“Ph.D. programs have fostered a culture that glorifies arcane unintelligibility while disdaining impact and audience. This culture of exclusivity is then transmitted to the next generation through the publish-or-perish tenure process.”
Let me take these two issues apart: first, deliberately exclusive; second, unnecessary.

Steve Fuller, who is a professor of Social Epistemology at the University of Warwick, argues (for example in his book “Knowledge Management Foundations”) that the value of knowledge is related to the scarcity of access to it. For that reason, academics have an incentive to put hurdles in the way of those wanting to get into the ivory tower and make it more difficult than it has to be. It is a good argument, though it is hard to tell how much of this exclusivity is deliberate. At least when it comes to my colleagues in math and physics, the exclusivity seems more a matter of neglect than of intent. Inclusivity takes effort, and most academics don’t make this effort.

This brings me to the argument that academic slang is unnecessary. Unfortunately, this is a very common belief. For example, in reaction to my recent post about the tug-of-war between accuracy and popularity in science journalism, several journalists remarked that surely I must have meant precision rather than accuracy, because good journalism can be accurate even though it avoids technical language.

But no, I did in fact mean accuracy. If you don’t use the technical language, you’re not accurate. The whole raison d’être [entirely unnecessary French expression meaning “reason for existence”] of professional terminology is that it is the most accurate description available. And PhD programs don’t “glorify unintelligible gibberish”, they prepare students to communicate accurately and efficiently with their colleagues.

For physicists the technical language is equations, the most important ones carry names. If you want to avoid naming the equation, you inevitably lose accuracy.

The second Friedmann equation, for example, does not just say the universe undergoes accelerated expansion with the present values of dark matter and dark energy, which is a typical “non-technical” description of this relation. The equation also tells you that you’re dealing with a differentiable, metric manifold of dimension 4 and Lorentzian signature and are within Einstein’s theory of general relativity. It tells you that you’ve made an assumption of homogeneity and isotropy. It tells you exactly how the acceleration relates to the matter content. And constraining the coupling constants for certain Lorentz-invariance violating operators of order 5 is not the same as testing “space-time graininess” or testing whether the universe is a computer simulation, to just name some examples.
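For concreteness, here is the equation I have in mind, in one common convention (with a the scale factor, ρ the mass density, p the pressure, and Λ the cosmological constant):

\[
\frac{\ddot a}{a} \;=\; -\,\frac{4\pi G}{3}\left(\rho + \frac{3p}{c^2}\right) + \frac{\Lambda c^2}{3}\,.
\]

Every piece of notation in this one line silently invokes the assumptions just described: a four-dimensional Lorentzian metric, Einstein’s field equations, and a homogeneous and isotropic matter distribution.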

These details are both irrelevant and unintelligible for the average reader of a pop sci article, I agree. But, I insist, without these details the explanation is not accurate, and not useful for the professional.

Technical terminology is an extremely compressed code that carries a large amount of information for those who have learned to decipher it. It is used in academia because without compression nobody could write, let alone read, a paper. You’d have to attach megabytes worth of textbooks, lectures and seminars.

In science, most terms are cleanly defined, others have various definitions, and some I admit are just not well-defined. In the soft sciences, the situation is considerably worse. In many cases trying to pin down the exact meaning of an -ism or -ology opens a bottomless pit of various interpretations and who-said-whats that date back thousands of years. This is why my pet peeve is to discard soft science arguments as useless due to undefined terminology. However, one can’t really blame academics in these disciplines – they are doing the best they can building castles on sand. But regardless of whether their terminology is as efficient as that of the hard sciences, it too is used for the sake of compression.

So no, academic slang is not unnecessary. But yes, academic language is exclusive as a consequence. In that it is no different from other professions. Just listen to your dentist and her assistant discuss their tools and glues, or look at some car-fanatics forum, and you’ll find the same exclusivity there. The difference is gradual and lies in the amount of time you need to invest to become one of them, to learn their language.

Academic language is not purposefully designed to exclude others, but it arguably serves this purpose once in place. Pseudoscientists tend to underestimate just how obvious their lack of knowledge is. It often takes a scientist no more than a sentence to recognize an outsider as such. Are you able to tell the opening sentences of this blogpost from gibberish? Can you tell the snarxiv from the arxiv?

Indeed, it is in reality not the PhD that separates the science-insider from the outsider. The PhD defense is much like losing your virginity: vastly overrated. It looms big in your future, but once it’s in the past you note that nobody gives a shit. You mark your place in academia not by hanging a framed title on your office door, but by using the right words in the right place. Regardless of whether you do have a PhD, you’ll have to demonstrate the knowledge equivalent of a PhD to become an insider. And there are no shortcuts to this.

For scientists this demarcation is of practical use because it saves them time. On the flipside, there is the occasional scientist who goes off the deep end and who then benefits from having learned the lingo to make nonsense sound sophisticated. However, compared to the prevalence of pseudoscience this is a rare problem.

Thus, while the exclusivity of academic language has beneficial side effects, technical expressions are not deliberately created for the purpose of excluding others. They emerge and get refined in the community as efficient communication channels. And efficient communication inside a discipline is simply not the same as efficient communication with other disciplines or with the public, a point that Kristof entirely ignores in his op-ed. Academics are hired and get paid for communicating with their colleagues, not with the public. That is the main reason academic writing is academic. There is probably no easy answer to just why it has come to be that academia doesn’t make much of an effort to communicate with the public. Quite possibly Fuller has a point there, in that scarcity of access protects the interests of the communities.

But leaving aside the question of where the problem originates, prima facie [yeah, I don’t only know French, but also Latin] the reason most academics are bad at communicating with the public is simple: They don’t care. Academia presently selects very strongly for single-minded obsession with research. Communicating with the public, whether about one’s own research or to chime in with opinions on science policy, is in the best case useless and in the worst case harmful to the job that pays their rent. Accessibility and popularity do not, for academics, convert into income, and even an NYT op-ed isn’t going to change anything about this. The academics you find in the public sphere are primarily those who stand to benefit from the limelight: directors and presidents of something spreading word about their institution, authors marketing their books, and a few lucky souls who found a way to make money with their skills and gigs. You do not find the average academic making an effort to avoid academic prose, because they have nothing to gain from it.

I’ve read many flowery words about how helpful science communication – writing for the public, public lectures, outreach events, and so on – can be to make oneself and one’s research known. Yes, it can be, and anecdotally this has helped some people find good jobs. But it works out so rarely that on average it is a bad investment of time. That academics are typically overworked and underpaid anyway doesn’t help. That’s not good, but that’s reality.

I certainly wish more academics would engage with the public and make that effort of converting academic slang to comprehensible English, but knowing how hard my colleagues work already, I can’t blame them for not doing so. So please stop complaining that academics do what they were hired to do and that they don’t work for free on what doesn’t feed their kids. If you want more science communication and less academic slang, put your money where your mouth is and pay those who make that effort.

The first of the examples at the top of this post is random nonsense generated with SciGen. The second example is from the introduction of the Living Review on Asymptotic Safety. Could you tell?

Sunday, January 19, 2014

Trouble in the Ivory Tower: Not an academic problem.

The Ivory Tower.
Image from The Neverending Story.
“Science is the only news,” Stewart Brand told us. And the news is that research misconduct is on the rise while reproducible results are in decline. Peer review, the process in which scientific publications are evaluated by anonymous peers, has become a farce, as scientists’ existential worries make it an exercise in forward defense with the occasional backhand offense. Scientists produce more papers now than ever, and then hide them behind journal subscriptions so costly nobody can read them – a good idea, because most published research findings are probably false, though that too is probably false. Measures for scientific success have been criticized ever since they began being used, and the academic system chokes on social effects like herding, pluralistic ignorance, and groupthink.

Yes, science works, no need to call me names. But science doesn’t work as well as it could, not as well as it should, not as well as we need it to work.

Scientific institutions and scientific management are stuck in the last century. The academic system today is in no shape to cope with the demands of high connectivity in a global and growing workforce, is unable to deal with complex trans-national and interdisciplinary problems, and can’t handle the amplification of social feedback that information technology has brought.

The academic system, in brief, has the same problem as our political, social and economic systems.

The biggest challenge mankind faces today is not the development of some breakthrough technology. The biggest challenge is to create a society whose institutions integrate the knowledge that must precede any such technology, including knowledge about these institutions themselves. All of our big problems today speak of our failure, not to envision solutions, but to turn our ideas and knowledge into reality.

It’s not that we lack creativity. It’s that the kind of creativity that comes to us naturally does not latch onto problems that evolution didn’t equip us to register in the first place. We do not comprehend the interplay of large crowds of people, and we are unable to individually beat our own psychology, which is rooted in groups of tens to hundreds, not billions. To arrange our living together in groups larger than we can intuit, we agree on rules of conduct and incentives that align our individual actions with collective trends so that both are to our benefit. This requires systems design. It requires science. And before that, it requires that we acknowledge the problem.

But we watch. We watch with bewilderment as a video of sunrise is broadcast on Tiananmen Square, where thick smog forces onlookers to wear breathing masks. We watch with horrified fascination video footage of the big garbage swirl and of birds dying from indigestible plastic pieces. We watch, hypnotized, replays of negotiation failures that make our adaptation to climate change more costly by the day. The way we have arranged, organized, policed and institutionalized our living together leaves us to watch ourselves watching, stunned at our own inability to change anything about it.

And scientists, the ones who should be able to analyze the situation and to devise a solution, aren’t any better.

Scientists, of course, know exactly what is wrong with academia. Leaving aside that no two of them can agree on how to do it, they know how to solve the problem. There’s no shortage of proposals for how to fix peer review and scientific publishing and for how to better distribute resources. Futures markets, auction markets, lottery systems, open peer review, and dozens of alternative metrics have been suggested, we’ve seen it all. They write papers about it and send them for peer review. The rest is the same old he-said-she-said.

So far, scientists have failed miserably to adapt the academic system to the changing demands of the 21st century. They belabor the problem and devise solutions, but are unable to implement them. And in the ocean of conference proceedings they watch the giant abstract swirl.

Academia mirrors the problem of our societies in a nutshell. The members of the academe, they’re all talk but no walk. We are being told that scientists are now studying the interconnectivity of the multi-layered networks that govern our societies, and we ask for answers and advice, we ask to be informed about how to solve our problems. There’s nobody else to solve these problems.

Social systems adapt to changing demands much like organisms do, by gradual modification and selection. But this process takes time – a lot of time – and it’s time we cannot afford. The only way to accelerate this adaptation is the scientific method: a targeted, controlled, and recorded series of modifications. Many existing projects today aim to track and analyze the complex interactions of our highly interwoven, networked world. But not a single one of these projects addresses the real problem, which is how to use this knowledge in the very systems that are being studied. It is this feedback of knowledge about the system back into the system that is necessary for our institutions to adapt. It requires a self-consistent scientific approach to institutional design, an approach that doesn’t exist and is nowhere near existence.

We need scientists to help us create social systems that organize our living together in groups so large that our evolutionary brains, trained to deal with small groups, cannot cope. Trial and error will take too long, and the errors are too costly now. But scientists are like the overweight doctor preaching the benefits of blood-pressure regulation, evidently unable to solve their own problems first. They presently can’t help us solve any problems, and we shouldn’t listen to their advice until they’ve solved their own.

Science is the only news, but it’s not only news. It’s the canary in the coal mine. Better watch it closely.