Sunday, December 30, 2007

Trend of arXiv submissions: Update

A couple of days ago I was wondering how the trend of average monthly submissions on the arXiv would continue this year. Here is the updated plot with the statistics from 2007:



As one can see, the trends more or less continue. The average number of submissions to the hep arXiv is shown in blue and includes hep-th, hep-ph, hep-lat and hep-ex. Red is astro-ph, green is cond-mat, violet is math with the pink addition being math-ph. The clear extensions are cross-links. The number of hep submissions slightly increased again, so it seems the temporary drop was a fluctuation. It seems to stagnate around 730 papers/month.

I am still wondering why that is, though. Several people have argued that this is to be expected and just suggests 100% participation from that community (this is also the explanation you find on the arXiv website). I don't think this can be the full story, though. I would expect the average number of submissions to be roughly a product of

- the total number of people in the community
- their average productivity
- the fraction of them using the arxiv
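
The product above can be written as a rough formula. This is only a sketch: the symbols $N$, $P$, $a$ and $f$ are labels introduced here for illustration, not notation taken from the arXiv statistics.

```latex
% N: average monthly submissions, P: number of people in the community,
% a: average papers per person per month, f: fraction of them using the arXiv
\[
  N \approx P \, a \, f
  \qquad\Longrightarrow\qquad
  \frac{\Delta N}{N} \approx \frac{\Delta P}{P} + \frac{\Delta a}{a} + \frac{\Delta f}{f}
\]
```

Read this way, a flat $N$ means that, to first order, the relative changes of the three factors have to cancel each other.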

First, it seems plausible to me that the fraction of people using the arXiv in the hep community is saturated in North America and Europe. But given that there are still new countries coming into the game, and that the arXiv is as global as can be, I am not sure whether this holds everywhere. That is to say, seen globally I'd still expect that fraction to be increasing.

Second, since the world population is generally increasing, one should think this general trend underlies the statistics, unless the fraction of people in the hep community relative to the total population decreases.

Think China, Japan, India. Then have a look at the submitters' affiliations and tell me one should not expect further growth. I mean, yeah, there are a lot of Germans, but we don't outweigh the rest of the world.

Taken together this means that, if people stick to the arXiv, either the number of people in the community doesn't grow as I'd expect, or they publish less. Another possibility, as Dylan mentioned, would be that people just submit papers to other arXivs. E.g., it might be that some stuff that used to end up on hep now goes into general physics or history of physics. Or maybe, now that there are more online journals that publish more or less immediately, people don't submit all their papers to the pre-print server? Has anybody noticed something like this?

The other thing that I found surprising is the sheer number of math publications. Not the trend - I understand that probably that community is still not completely arxiv-ed - but gee, look at all those math papers! I always had the impression mathematicians publish only sparingly; it seems I was somewhat mistaken there.

For more statistics see here.

6 comments:

Peter Woit said...

I would guess that mathematicians write somewhat fewer papers than physicists, but there are a lot of mathematicians, especially when you compare to a subfield of physics like HEP. At sizable research universities like Columbia, the math department is about an order of magnitude larger than the HEP group in the physics department.

Many mathematicians still don't post preprints to the arXiv, sometimes on the grounds that a claimed result whose proof has not been checked by a referee should not be publicly disseminated. This is changing, and I think over the next few years the number of math preprints will keep growing, with the final number completely overwhelming the number of HEP preprints.

I don't see any reason to believe that physicists not in the US/Western Europe have been slow to adopt the practice of posting to the arXiv. Quite the opposite, since they traditionally had trouble paying for mailing out paper preprints. It is true that China is producing large numbers of new Ph.Ds, and there should be increasing numbers of publications from them.

muon said...

Your interpretation and Peter's are both good ones. Here's another thought: the LEP, SLC and Tevatron experiments confirmed the standard model through the nineties. There was widespread hope that LEP II would find the Higgs and Supersymmetry, resulting in a tidal wave of phenomenological papers. But by 2001-2 it was clear that LEP had no strong signals for any new physics. After that, theorists continued to invent new models of various sorts, but without any major excitement from experiment, this settled down to a constant output per year since the end of LEP. It follows that there should be a major increase in arXiv submissions once the LHC starts delivering results - especially if signals of new physics are found. Let's see what this graph looks like in four or five years... ;)

Phil Warnell said...

Perhaps there is a blessing to be found here, which is that, at least in the hep category, the number appears to be leveling off somewhat. I can’t remember just where I read it, or who said it; nonetheless, the comment was about this exponential rise in published papers. The author said that at the present time it was almost impossible to keep up even in one's own specialty. The concern raised was that the sheer volume could become so overwhelming that it would lead to a sort of academic or intellectual meltdown. Perhaps this data indicates some sort of feedback loop that serves to adjust for this. That is, in order to publish, one must attempt to make the work consistent with what has preceded it. In the simplistic sense, that means scientists are now required to read so much that it serves to limit their writing. I will admit it is only a half-baked contention; however, it may serve to add some fuel to the fire.

Bee said...

Hi Phil:

Yes, as I wrote in the comments to the earlier post, I would consider it a good thing if people published less, if only because it would seem to indicate that the publication pressure is shifting from quantity towards, hopefully, quality. However, I recall very vividly a plot showing the number of scientific publications that I saw two years ago (maybe in Physics Today?). I couldn't find a reference unfortunately, so you'll have to believe me. It showed the number of publications from Europe catching up with and exceeding those from North America some time around 2004. But more interesting in this regard was the curve from Asia, which was still growing steeply. Added up, the total curve would still be growing, and that's what I would have expected for the arXiv as well. Also, the plot I have in mind was showing peer-reviewed publications. I am probably not the only one who has the impression that the arXiv seems to collect an increasing number not of pre-prints, but of never-to-be-printed papers, which are easier to produce.

Hi Peter:

Yeah, having thought about it, I agree that I should have expected the number of publications on the math arXiv to be that high. I was just surprised for no good reason, actually.

Regarding arXiv use: that would mean that in those places where it is possible to use the arXiv, people prefer it over other means. I just think that the number of people who start using the arXiv (globally) should still be growing.

It would be interesting if one could have the above statistic broken down by continent.

Best,

B.

rillian said...

Phil, your quote reminds me of a joke from a previous incarnation of this discussion:

PRL is expanding across library shelves faster than the speed of light. But that's ok since no information is being transmitted!

Phil Warnell said...

Hi Bee,

Like I said, it was only a half-baked idea. However, your comment about the increasing number of papers originating outside the U.S. is, I believe, related to something else: the average aptitude for (understanding of) science in the U.S., as opposed to other countries, has plummeted over the years. This is borne out by the latest data published by PISA (the Programme for International Student Assessment). In this study, out of the 57 participating countries that tested their 15-year-old students, the U.S. ranked only 29th in terms of science acumen. I’m happy to report that Canada ranked 3rd. With this result, I think that spread is not only explainable but can be expected to increase. The complete list, as ranked, is as follows:

1 Finland
2 Hong Kong-China
3 Canada
4 Chinese Taipei
5 Estonia
6 Japan
7 New Zealand
8 Australia
9 Netherlands
10 Liechtenstein
11 Korea
12 Slovenia
13 Germany
14 United Kingdom
15 Czech Republic
16 Switzerland
17 Macao-China
18 Austria
19 Belgium
20 Ireland
21 Hungary
22 Sweden
23 Poland
24 Denmark
25 France
26 Croatia
27 Iceland
28 Latvia
29 United States
30 Slovak Republic
31 Spain
32 Lithuania
33 Norway
34 Luxembourg
35 Russian Federation
36 Italy
37 Portugal
38 Greece
39 Israel
40 Chile
41 Serbia
42 Bulgaria
43 Uruguay
44 Turkey
45 Jordan
46 Thailand
47 Romania
48 Montenegro
49 Mexico
50 Indonesia
51 Argentina
52 Brazil
53 Colombia
54 Tunisia
55 Azerbaijan
56 Qatar
57 Kyrgyzstan