Where do we draw the boundary between science and pseudoscience? It’s a question philosophers have debated for as long as there’s been science – and last time I looked they hadn’t made much progress. Ask a sociologist and the answer is normally a variant of: science is what scientists do. So what do scientists do?
You might have heard that scientists use what’s called the scientific method, a virtuous cycle of generating and testing hypotheses which supposedly separates the good ideas from the bad ones. But that’s only part of the story, because it doesn’t tell you where the hypotheses come from to begin with. Science doesn’t operate with randomly generated hypotheses for the same reason natural selection doesn’t work with randomly generated genetic codes: it would be hopelessly inefficient, and any attempt to optimize the outcome would be doomed to fail. What we do instead is filter hypotheses heavily, considering only those which are small mutations of ideas that have previously worked. Scientists like to be surprised, but not too much.
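To see why, consider a toy search problem: guessing hypotheses completely at random versus mutating the best one found so far. The sketch below is my own illustration (a made-up 100-bit “idea” whose quality is just the number of 1-bits), not a model of actual scientific practice, but it shows how much further small mutations get with the same budget.

```python
# Toy illustration (mine, not from the post): random guessing versus
# small mutations of the best idea found so far, on a made-up 100-bit
# "idea" whose quality is simply the number of 1-bits.
import random

N_BITS, STEPS = 100, 2000

def quality(idea):
    return sum(idea)

def random_guessing():
    best = 0
    for _ in range(STEPS):
        idea = [random.randint(0, 1) for _ in range(N_BITS)]
        best = max(best, quality(idea))
    return best

def small_mutations():
    current = [random.randint(0, 1) for _ in range(N_BITS)]
    for _ in range(STEPS):
        candidate = current[:]
        candidate[random.randrange(N_BITS)] ^= 1   # mutate one bit
        if quality(candidate) >= quality(current): # keep what works
            current = candidate
    return quality(current)

print("random guessing:", random_guessing())  # typically around 65
print("small mutations:", small_mutations())  # typically close to 100
```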
Indeed, if you look at the scientific enterprise today, almost all of its institutionalized procedures are methods not for testing hypotheses, but for filtering them: degrees, peer review, scientific guidelines, replication studies, measures of statistical significance, and community quality standards. Even the use of personal recommendations serves that end. In theoretical physics in particular, the prevailing quality standard is that theories need to be formulated in mathematical terms. All of these are requirements which have evolved over the last two centuries – and they have proved to work very well. It’s only smart to use them.
But the business of hypothesis filtering is a tricky one, and it doesn’t proceed by written rules. It is a method that has developed through social demarcation, and as such it has its pitfalls. Humans are prone to social biases, and every once in a while an idea gets dismissed not because it’s bad, but because it lacks community support. And there is no telling how often this happens, because these are the stories we never get to hear.
It isn’t news that scientists lock shoulders to defend their territory and use technical terms the way fraternities use secret handshakes. It thus shouldn’t come as a surprise that an electronic archive which caters to the scientific community would develop software to emulate the community’s filters. And that, in a nutshell, is what the arXiv is doing.
In an interesting recent paper, Luis Reyes-Galindo had a look at the arXiv moderators and their reliance on automated filters:
- Automating the Horae: Boundary-work in the age of computers
Luis Reyes-Galindo
arXiv:1603.03824 [physics.soc-ph]
In the attempt to develop an algorithm that would sort papers into arXiv categories automatically, and thereby help arXiv moderators decide when a submission needs to be reclassified, it turned out that papers which scientists would dismiss as “crackpottery” either showed up as not classifiable or stood out through language significantly different from that of the published literature. According to Paul Ginsparg, who developed the arXiv more than 20 years ago:
“The first thing I noticed was that every once in a while the classifier would spit something out as ‘I don't know what category this is’ and you’d look at it and it would be what we’re calling this fringe stuff. That quite surprised me. How can this classifier that was tuned to figure out category be seemingly detecting quality?”

It doesn’t surprise me much – you can see this happening in comment sections all over the place: the “insiders” can immediately tell who is an “outsider.” Often it doesn’t take more than a sentence or two, an odd expression, a term used in the wrong context, a phrase that nobody in the field would ever use. It follows that with smart software you can tell insiders from outsiders even more efficiently than humans can. According to Ginsparg:
“[Outliers] also show up in the stop word distribution, even if the stop words are just catching the style and not the content! They’re writing in a style which is deviating, in a way. [...]
“What it’s saying is that people who go through a certain training and who read these articles and who write these articles learn to write in a very specific language. This language, this mode of writing and the frequency with which they use terms and in conjunctions and all of the rest is very characteristic to people who have a certain training. The people from outside that community are just not emulating that. They don’t come from the same training and so this thing shows up in ways you wouldn’t necessarily guess. They’re combining two willy-nilly subjects from different fields and so that gets spit out.”
“We've actually had submissions to arXiv that are not spotted by the moderators but are spotted by the automated programme [...] All I was trying to do is build a simple text classifier and inadvertently I built what I call The Holy Grail of Crackpot Filtering.”

Trying to speak in the code of a group you haven’t been part of, at least for some time, is pretty much impossible, much like it’s impossible to fake the accent of a city you haven’t lived in for a while. Such in-group and out-group demarcation is the subject of much study in sociology, not specifically the sociology of science, but sociology in general. Scientists are human, and of course in-group and out-group behavior shapes their profession too, even though they like to deny it, as if they were superhuman think-machines.
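To make this concrete, here is a minimal sketch in Python of the general mechanism: a category classifier whose low-confidence predictions serve as an outlier flag, combined with a stop-word “style” profile of the kind Ginsparg describes. This is my own illustration using scikit-learn, not the arXiv’s actual moderation code; the training data, category labels, and thresholds are placeholders.

```python
# Minimal sketch (my own illustration, not the arXiv's moderation code):
# a category classifier whose low-confidence predictions double as an
# outlier flag, combined with a stop-word "style" profile.
import numpy as np
from sklearn.feature_extraction.text import (
    CountVectorizer, ENGLISH_STOP_WORDS, TfidfVectorizer)
from sklearn.linear_model import LogisticRegression

# Placeholder training corpus: (abstract, arXiv category) pairs.
train_texts = ["abstract of a published gr-qc paper ...",
               "abstract of a published hep-th paper ..."]
train_labels = ["gr-qc", "hep-th"]

# Content-based category classifier (stop words removed).
content_vec = TfidfVectorizer(stop_words="english")
clf = LogisticRegression(max_iter=1000).fit(
    content_vec.fit_transform(train_texts), train_labels)

# Stop-word frequency profile: style rather than content.
style_vec = CountVectorizer(vocabulary=sorted(ENGLISH_STOP_WORDS)).fit(train_texts)

def style_profile(texts):
    counts = style_vec.transform(texts).toarray().astype(float)
    return counts / np.maximum(counts.sum(axis=1, keepdims=True), 1.0)

mean_style = style_profile(train_texts).mean(axis=0)

def looks_like_outlier(text, min_confidence=0.5, max_style_distance=0.2):
    """Flag a submission if the classifier is unsure of its category
    or its stop-word profile sits far from the training average."""
    confidence = clf.predict_proba(content_vec.transform([text])).max()
    style_distance = np.linalg.norm(style_profile([text])[0] - mean_style)
    return confidence < min_confidence or style_distance > max_style_distance
```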
What is interesting about this paper is that, for the first time, it openly discusses how this process of filtering happens: it’s software that literally encodes the hidden rules physicists use to sort out cranks. From what I can tell, the arXiv filters work reasonably well; otherwise there would be much more complaint in the community. The vast majority of researchers in the field seem quite satisfied with what the arXiv is doing, meaning the arXiv filters match their own judgement.
There are exceptions, of course. I have heard stories of people who were working on new approaches that fell between two stools and got flagged as potential crackpottery. The cases I know of were eventually resolved, but that may tell you more about the people I know than about how such issues typically end.
Personally, I have never had a problem with the arXiv moderation. I had a paper reclassified from gen-ph to gr-qc once by a well-meaning moderator, which is how I learned that gen-ph is the dump for borderline crackpottery. (How would I have known? I don’t read gen-ph. I was just assuming someone reads it.)
I don’t so much have an issue with what gets filtered on the arXiv; what bothers me much more is what does not get filtered and hence, implicitly, gets the community’s approval. I am very sympathetic to the concern of John The-End-Of-Science Horgan that scientists don’t do enough to clean their own doorstep. There is no “invisible hand” that corrects scientists if they go astray; we have to do this ourselves. In-group behavior can greatly misdirect science because, given sufficiently many people, even fruitless research can become self-supporting. No filter derived from the community’s own judgement will do anything about this.
It’s about time that scientists start paying attention to social behavior in their community. It can, and sometimes does, affect objective judgement. Ignoring or flagging what doesn’t fit into pre-existing categories is one such social problem that can stand in the way of progress.
In a 2013 paper published in Science, a group of researchers quantified the likelihood of combinations of topics in citation lists and studied the correlation with the probability of the paper becoming a “hit” (meaning in the upper 5th percentile of citation scores). They found that having previously unlikely combinations in the cited literature is positively correlated with a paper’s later impact. They also note that the fraction of papers with such ‘unconventional’ combinations decreased from 3.54% in the 1980s to 2.67% in the 1990s, “indicating a persistent and prominent tendency for high conventionality.”
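For readers who want a feel for the kind of bookkeeping such a study involves, here is a rough sketch in Python. It is my own reconstruction, not the authors’ code: it scores how unusual the co-cited journal pairs in a paper’s reference list are, then compares “hit” rates between papers with and without unusual pairs. The published analysis uses a randomized citation network as its null model; a simple independence baseline stands in for it here, and the input format is invented.

```python
# Rough sketch (mine, not the authors' code): flag papers whose reference
# lists contain journal pairs that are co-cited far less often than an
# independence baseline would predict, then compare "hit" rates (top 5%
# by citations) between the two groups of papers.
from collections import Counter
from itertools import combinations

def hit_rate_by_conventionality(papers, rarity_cutoff=0.1):
    """papers: list of dicts like {"refs": ["PRL", "Nature"], "citations": 42}"""
    n = len(papers)
    journal_count, pair_count = Counter(), Counter()
    for p in papers:
        journals = sorted(set(p["refs"]))
        journal_count.update(journals)
        pair_count.update(frozenset(c) for c in combinations(journals, 2))

    def surprise(pair):
        a, b = tuple(pair)
        expected = journal_count[a] * journal_count[b] / n  # independence baseline
        return pair_count[pair] / max(expected, 1e-9)       # < 1: rarer than chance

    citation_threshold = sorted(p["citations"] for p in papers)[int(0.95 * n)]
    hits = {"unconventional": [], "conventional": []}
    for p in papers:
        pairs = [frozenset(c) for c in combinations(sorted(set(p["refs"])), 2)]
        group = ("unconventional"
                 if any(surprise(q) < rarity_cutoff for q in pairs)
                 else "conventional")
        hits[group].append(p["citations"] >= citation_threshold)
    return {g: (sum(v) / len(v) if v else None) for g, v in hits.items()}
```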
Conventional science isn’t bad science. But we also need unconventional science, and we should be careful not to assign the label “crackpottery” too quickly. If science is what scientists do, scientists should pay some attention to the science of what they do.