Tuesday, February 28, 2012

Everything is amazing and nobody writes errata

I am working on a review which had me digging out papers going back to 1920. It is very interesting to see the ideas that were suggested and then found not to work, or were forgotten and then rediscovered.

In an objective sense the papers become less readable the older they are. For one, at the beginning of the century many papers were still written in French, German or Russian, and the notation and terminology have changed since. On top of this, people back then were discussing problems whose answers we know today, and it can be difficult to follow their trains of thought. And then there's the physical readability, which deteriorates. Printouts of scans, especially in small fonts with the toner low, can give me a headache that is not conducive to my attention.

On one scanned paper that I read, some overactive software had removed background noise, and in the process also erased all punctuation marks. In the text that was merely annoying, but unfortunately the authors had used dots and primes for derivatives.

However, in a subjective sense the papers seem to be getting less readable the newer they are, and that almost discontinuously. The style of writing has been changing.

Everything written before roughly 1990 is carefully motivated, edited, referenced and explained. One also very frequently finds errata, or constructive comments in the next issue of the journal, a practice that seems to have fallen somewhat out of fashion since. By the late 1990s, most papers are difficult to understand if one doesn't happen to work on a closely related topic or at least follow it; the motivation is often entirely missing or very narrow, common arguments are omitted and apparently just assumed to be known, variables are never introduced and believed to conform to some standard notation (that in 100 years nobody will recall), technical terms are neither explained nor referenced, and yet hardly anybody ever seems to cite the textbooks that would explain them.

Needless to say, this is not the case for all papers; there are exceptions, but by and large that has been my impression. It's not so bad actually when you are familiar with the topic. In fact, I am often relieved if I don't have to read yet another introduction that says the same thing! But it is likely that to the reader not familiar with the topic, which in some decades might be pretty much all readers, the relevance and argumentation remain unclear.

So then I've been wondering why it seems that by the mid-1990s the style in which scientific papers were written changed. Here are some explanations that I came up with:
  1. That's just me. Everybody else thinks the newer a paper is, the easier it is to understand, and that people back then only wrote confusing garble.

  2. Selection bias. The old papers that are still cited today, or are at least at the root of citation trees, are the most readable ones.

  3. Specialization. There are many more physicists today than in 1920, and calculations have been getting more complicated. It would however take up too much space to explain all technical details or terminology from scratch, which makes papers increasingly opaque. There is certainly some truth to that, but it doesn't quite explain why this seems to have happened so suddenly.

  4. Typesetting changes. Stefan pointed out that in the 1990s LaTeX became widely used. Before that, many papers were typed by secretaries on typewriters and the equations put in by hand, then the draft was sent by mail. The ease and speed of the process today breeds carelessness.

  5. Distribution changes. The pulse of academic exchange has been quickening. Today, researchers don't write to be understood in 100 years, they write to get a job next year. Errata or comments don't work towards that end. They don't add motivations because the people they want to reach share their opinion that the topic is relevant anyway.

Most likely it's a combination of the above. What do you think?

I've been wondering whether the future of the paper isn't an assembly of building blocks. Why does everybody have to write the motivation or explanation of techniques used all over again? I am thinking that in ten years, when you download a paper, you can choose an option for the level of detail that you want, and then get the paper customized for the knowledge you bring. That won't always work, but for research fields in stages 3 and 4, it might work quite well.

21 comments:

Igor Khavkine said...

Bee, are you willing to publicly share some examples?

Bee said...

No. I don't want to publicly discuss the writing style of some of my colleagues: it might come off as a criticism rather than just pointing out how the writing style has been changing. If you read the paper that we discussed here (from 1964), then search the arxiv for related topics and randomly read a few, I think you'll see what I mean.

Steve W said...

Editors don't edit anymore. You see plenty of published papers with grammatical errors in the title, or the first paragraph. The editors seem to be solely in charge of sending the damn things out for review, and have abdicated any role in making the material or the paper readable.

Bee said...

Hi Steve,

I don't think it should be the editor's work to correct the spelling and grammar. I know that some journals do at least a basic check with the copy-editing, because I sometimes get corrections with the proofs. However, this is clearly the exception rather than the rule. I suspect it's primarily lack of personnel. Best,

B.

Steve W said...

Book editors do this, and so do magazine editors. Why shouldn't journal editors? After all, that's what editing *is*!

But mainly my point is that there is no systematic demand for anything approaching good writing. If editors, and perhaps referees, do not even comment on basic grammatical errors, what hope is there for the correction of more subtle errors in clarity of exposition? Somebody, somewhere, has to provide feedback.

Perhaps the real problem is that nobody actually reads papers anymore, since they are no longer the primary means by which ideas are communicated.

Bee said...

Hi Steve,

Referees do frequently point out the presence of grammatical errors (the spelling is easier to deal with). I know this both from the side of the referee as well as from the side of the author.

The problem is, nobody makes grammatical mistakes deliberately. Just telling somebody they're making mistakes doesn't help. Since I left school, I have never received any feedback on my English grammar, except for the odd case on this blog. My English has benefited from several years in North America; not everybody had that chance. But I'm in no position to correct somebody else's grammar. Thus, I don't do it in my referee reports, editors don't do it, and in the end the paper remains unreadable (the worst don't get published, however, because one can't understand them at all).

Journal editors aren't hired for English grammar, they're hired because they know the field. Basically, it's an issue that nobody feels responsible for. Best,

B.

Zephir said...

/* . By the late 1990s, most papers are difficult to understand.. */
In dense aether theory the evolution of human understanding roughly follows the geometry of surface wave spreading. Up to a certain distance this spreading remains low-dimensional and essentially deterministic from the surface wave perspective. Beyond a certain distance dispersion takes place and the observational perspective becomes hyperdimensional and increasingly complex. The contemporary generation of physicists simply deals with hyperdimensional phenomena (dark matter, supersymmetry,..), the observation of which was enabled by the development of technology. Accordingly, their formal models have become poorly conditioned and fragmented (the many versions of string theory, as an example).

In addition, the late 1990s is precisely the period when a relative overemployment in theoretical physics emerged. You're not required to keep so many formally thinking theorists for the description of indeterministic phenomena. Moreover, the depletion of fossil fuels and the deepening economic crisis put pressure on the grant system for physicists, who are required to spend much more time just asking for more money. The quality of research work and of its public presentation stagnates as a result.

Phillip Helbig said...

It's probably a combination of 2, 4 and 5. Probably 4 is the most important. When it takes much longer to write a paper (I recall someone talking about writing with a typewriter: one page of good copy per day was the goal), investing a few hours in careful proofreading doesn't make much difference, while this is a substantial fraction of writing (not doing the work, just writing it up) with LaTeX.

Charles Day said...

As an editor at Physics Today, I read a lot of papers in different fields from several journals. The quality of papers—as written expositions of research, rather than repositories of research results—is mixed. Some papers are paragons of clarity; others are hard to follow even, I suspect, for experts.

In my experience, papers in APS's Physical Review Letters have consistently clear, accessible introductions—which suggests that Steve W is correct: editors and editorial policies are to blame for the trend you spotted. If PRL can insist on good writing, so can other journals.

Mason said...

Hmpf... from my perspective, PRL is one of the worst offenders when it comes to poorly-written articles! The 4-page restriction leaves little room for motivation/context or for the intermediate steps that show where one actually gets a result (!). One thing I like occurs in the journal Chaos (though many authors don't do it well): in addition to the abstract, there is an introductory bold paragraph meant to give that big picture. There are some great papers in PRL, but overall I think the articles in "archival" journals are written much better on average.

Eric said...

I think there might be a simple explanation for the abrupt change in style and readability of papers in the mid nineties. If I remember things correctly the mid to late nineties is when the Internet really caught fire. Suddenly information that was previously available to only a few in the field was available to all with a decent Internet connection.

Along with this new access to information by the rabble (me and others with interest but no PhD in the field) a new selection process had to take place to keep us from mucking things up. By coming up with more esoteric terms, less definition of terms, more complex math etc., the rabble was better able to be kept out of the loop even if they had access to the papers. I think the new style of papers may be a new kind of secret code used just for this purpose.

Unfortunately it has the side effect of inculcating the feeling among researchers that their papers may have more worth than they have gist because there are only a few researchers who understand the papers. So it reinforces bizarre ideas that are often not just counterintuitive but plainly wrong.

Garrett said...

I think there's been a cultural shift towards obfuscation in some branches of the physics literature, with papers written to establish precedence and cleverness rather than clearly and comprehensively transmitting ideas. Since the community is relatively small, this shift might have been driven by the writing of a few popular individuals, whose style was then copied. Difficult thing to judge objectively though. I suppose this mostly counts as (3).

Rastus Odinga Odinga said...

I know what you mean, but on the other hand, especially in the 90's, Edward Witten was writing in a new style in which he really tried very hard to motivate what he was doing. [Take any of the papers he wrote in 1998 for example.] The people who followed that style, and they were not few, really did improve the intelligibility of papers in that field. So I can't agree that it has all been downhill. Sadly, as Witten has moved off into territory where people [quite rightly] don't want to go, his influence on style has decayed along with his influence on physics. But there are still some people writing in that way, e.g. Gubser.

Bee said...

Hi Charles, Mason,

I think you're talking about two different types of readability. PRL is readable in the sense that the papers make an effort to explain the topic in terms that anybody with an education in the field can understand, rather than only addressing a very specialized audience. They're unreadable in that, once you've understood that, you might want to know the details, but then there's not enough space. But usually you can find a longer paper somewhere. PRL is also one of the journals I had in mind above when saying that I did receive suggestions for corrections on my grammar.

I am not sure though that saying it's the editor's responsibility really explains why the style has been changing. Best,

B.

Bee said...

Hi Garrett, Rastus,

That's an interesting point. It had not occurred to me that some people might have 'led by example.' It would be good to know though if this happened in other fields as well. If so, it seems unlikely these 'examples' were uncorrelated. Best,

B.

Eric said...

I think I should say about my previous comment that I wasn't saying the "rabble", including me, would still be able to keep up with the existing pre-1990 papers. Speaking for myself, especially in the math department, I have sufficient deficits that I would never understand many of those papers either. But I think many of us would "get" what was being said with a little more background material. And I think I would even understand the simpler forms of math, up to and including basic derivatives and integrals. Maybe not tensor math so much.

I wasn't trying to lift myself and others up in my previous comment so much as saying that, at least in physics, it operates much like the guilds of old. There is a protective agency that attaches to that hard-won knowledge. But I really think this method of obfuscating publicly available information serves physics poorly.

Sorry if I said it previously in a way that came off as self serving. Wasn't meant that way.

Anonymous Snowboarder said...

Bee - I think point #5 and perhaps #4. I think #4 has shifted a portion of the initial vetting of papers from third parties to the author, who frankly has less interest in grammar, spelling and understandability to others. #5 I think is a large factor and could apply as well outside of the physics community. Just about every publication I read today (especially those now web only and formerly print only) has gone downhill. Editing (which should also include in some cases fact checking) is cursory at best in order to get copy "out there" asap. It really does a disservice to everyone including future readers.

I do think a #6 needs to be included... the tendency to publish simply for publishing's sake. How many papers are really worth the paper they are written on? How many "results" could be included with others to publish a more comprehensive and meaningful paper?

Back in the day when I read (or tried to) PR D I always checked the rapid communications and errata first. I think errata are actually an indication of the worth of a paper - if it were not at least marginally important it would be completely passed by. That readers have comments and criticisms is generally a good sign. Perhaps you have missed the mark but you are doing something right.

Kaleberg said...

From http://arxiv.org/help/general - "Started in August 1991, arXiv.org (formerly xxx.lanl.gov) is a highly-automated electronic archive and distribution server for research articles."

That's probably unfair, but suggestive.

rab said...

at 4, "breeds carelessness"...

M*P*Lockwood said...

As I read this entire post I was thinking of what Kaleberg suggests. Arxiv, but also the internet in general. I can't recall if you had a post here about it, but I've read in several places how access to the internet affects our thinking. People are less likely to remember something if they're aware that they will be able to look it up again at any time. Probably people are also less likely to explain something if they feel like their reader has nearly instant access to the explanation. I've definitely noticed this effect in other forms of writing.

Bee said...

Yes, I wrote about that a few times. However, I don't find it a very plausible explanation. Though this might affect us over the course of generations, people don't change the way they think within a few years (sometimes I wish they did) and by the mid 90s the internet was still slow and had limited use. This is why I was mentioning more specifically distribution changes (due to the internet). Best,

B.