Tuesday, February 05, 2013

Consequences of using the journal impact factor

An interesting paper that should be mandatory reading for everybody making decisions on grant or job applications, especially for those people impressed by high-profile journals on publication lists:
It's a literature review that sends a clear message about the journal impact factor. The authors argue the impact factor is useless in the best case and harmful to science in the worst case.

The annually updated Thomson Reuters journal impact factor (IF) is, in principle, the number of citations in a given year to articles a journal published in the two preceding years, divided by the number of all articles it published in those two years. In practice, there is some ambiguity about what counts as an "article", and that is subject to negotiation with Thomson Reuters. For example, journals that publish editorials will not want them to count among the articles because they are rarely cited in the scientific literature. Unfortunately, this freedom in negotiation results in a lack of transparency that casts doubt on the objectivity of the IF. While I knew that, the problem seems to be worse than I thought. Brembs and Munafò quote some findings:
"For instance, the numerator and denominator values for Current Biology in 2002 and 2003 indicate that while the number of citations remained relatively constant, the number of published articles dropped...

In an attempt to test the accuracy of the ranking of some of their journals by IF, Rockefeller University Press purchased access to the citation data of their journals and some competitors. They found numerous discrepancies between the data they received and the published rankings, sometimes leading to differences of up to 19% [86]. When asked to explain this discrepancy, Thomson Reuters replied that they routinely use several different databases and had accidentally sent Rockefeller University Press the wrong one. Despite this, a second database sent also did not match the published records. This is only one of a number of reported errors and inconsistencies [87,88]."
(For references in this and the following quotes, please see Brembs and Munafò's paper.)
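
To get a feeling for how much the choice of denominator matters, here is a minimal Python sketch. It assumes the usual two-year definition of the IF. The function name and all the numbers are made up for illustration; they are not taken from Thomson Reuters or from the paper.

```python
def impact_factor(citations, citable_items):
    """Citations in year Y to items published in Y-1 and Y-2,
    divided by the number of items from Y-1 and Y-2 that count as 'articles'."""
    if citable_items <= 0:
        raise ValueError("citable_items must be positive")
    return citations / citable_items


# Hypothetical journal (illustrative numbers only).
citations = 6000    # citations received this year to the last two years' content
all_items = 1500    # everything published in those two years
editorials = 300    # items the journal would like to exclude from the denominator

if_all_items = impact_factor(citations, all_items)
if_negotiated = impact_factor(citations, all_items - editorials)
relative_gain = (if_negotiated - if_all_items) / if_all_items

print(f"IF counting all items:      {if_all_items:.2f}")
print(f"IF after dropping editorials: {if_negotiated:.2f}")
print(f"Relative gain:              {relative_gain:.0%}")
```

In this toy case the number of citations is the same in both lines; only the agreement about what counts as an article changes, and the IF moves from 4 to 5.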

That is already a bad starting point. But more interesting is that, even though there are surveys confirming that the IF captures researchers' perception of high impact quite well, if one looks at the numbers, it actually doesn't tell you much about the promise of articles in these journals:

"[J]ournal rank is a measurable, but unexpectedly weak predictor of future citations [26,55–59]... The data presented in a recent analysis of the development of [the] correlations between journal rank and future citations over the period from 1902-2009 reveal[s that]... the coefficient of determination between journal rank and citations was always in the range of ~0.1 to 0.3 (i.e., very low)."
And that is despite there being reasons to expect a correlation because high profile journals put some effort into publicizing articles and you can expect people to cite high IF journals just to polish their reference list. However,
"The only measure of citation count that does correlate strongly with journal rank (negatively) is the number of articles without any citations at all [63], supporting the argument that fewer articles in high-ranking journals go unread...

Even the assumption that selectivity might confer a citation advantage is challenged by evidence that, in the citation analysis by Google Scholar, only the most highly selective journals such as Nature and Science come out ahead over unselective preprint repositories such as ArXiv and RePEc (Research Papers in Economics) [64]."
So IFs of journals in publication lists don't tell you much. That scores as useless, but what's the harm? Well, there are some indications that studies published in high IF journals are less reliable, i.e., more likely to contain exaggerated claims or results that cannot later be reproduced.
"There are several converging lines of evidence which indicate that publications in high ranking journals are not only more likely to be fraudulent than articles in lower ranking journals, but also more likely to present discoveries which are less reliable (i.e., are inflated, or cannot subsequently be replicated).

Some of the sociological mechanisms behind these correlations have been documented, such as pressure to publish (preferably positive results in high-ranking journals), leading to the potential for decreased ethical standards [51] and increased publication bias in highly competitive fields [16]. The general increase in competitiveness, and the precariousness of scientific careers [52], may also lead to an increased publication bias across the sciences [53]. This evidence supports earlier propositions about social pressure being a major factor driving misconduct and publication bias [54], eventually culminating in retractions in the most extreme cases."
The "decline effect" (effects getting less pronounced in replications) and the problems with reproducibility of published research findings have recently gotten quite some attention. The consequences for science that Brembs and Munafò warn of are:
"It is conceivable that, for the last few decades, research institutions world-wide may have been hiring and promoting scientists who excel at marketing their work to top journals, but who are not necessarily equally good at conducting their research. Conversely, these institutions may have purged excellent scientists from their ranks, whose marketing skills did not meet institutional requirements. If this interpretation of the data is correct, we now have a generation of excellent marketers (possibly, but not necessarily also excellent scientists) as the leading figures of the scientific enterprise, constituting another potentially major contributing factor to the rise in retractions. This generation is now in charge of training the next generation of scientists, with all the foreseeable consequences for the reliability of scientific publications in the future."
Or, as I like to put it, you really have to be careful what secondary criteria (publications in journals with high impact factor) you use to substitute for the primary goal (good science). If you use the wrong criteria, you'll not only fail to reach an optimal configuration, but also make it increasingly harder to ever get there, because you're changing the background on which you're optimizing (selecting for people with non-optimal strategies).

It should clearly give us something to think about that even Gordon Macomber, the new head of Thomson Reuters, warns against depending on publication and citation statistics.

Thanks to Jorge for drawing my attention to this paper.

10 comments:

Phillip Helbig said...

Even if citation counts, impact factors etc really mean something, surely what counts is having one's own paper cited, not just being in a journal with a high average citation rate.

Bee said...

Hi Phillip,

The paper is about the journal impact factor, not about citation count. The citation count has its own shortcomings, but I think the main point they are trying to make here is that a paper that got published in a high impact journal is not necessarily better than other papers just by merit of its shiny reference. In fact, it might be worse because people preferably submit their most outstanding results there, which are also more likely to be unreliable (at least that's what I read out of the summary of references, most of which I haven't read myself).

I don't know though how much of a concern the concerns are that they voice, in that I don't know of any study that tells to some extend how much influence the journals somebody published in have on them being hired or being awarded grants. Best,

B.

Phillip Helbig said...

I agree; my point is that if one considers the shortcomings of using the citation count, it should be even less desirable to use essentially the average citation count in the journal. It's like spending time with tall people in the hope that one might grow. :-)

"to some extend" ---> "to what extent"

I think the publication record matters in both cases, but the details of how it is evaluated vary quite a bit from place to place.

Uncle Al said...

Management obsesses on what is measurable instead of promoting what is important. How many PhD candidates did Feynman generate? Discharge for cause: expectation of productivity. The least lovable people - the ones automatically not admitted or hired - are fonts of important discovery. Innovation is insubordination.

Hierarchal management is worse than random choice. A large slice of grant funding should support the young and weirdly creative - exactly opposite to reality. TRIZ, the theory of invention, tells us how to succeed: "do it the other way." A spectacular research and instant communication empire stares into its collective bellybutton, earning productivity bonuses for quantifying lint.

Erik said...

On-topic: Reading this, I am glad that I am still a student and don't need to worry about my citations :)

Off-topic: I am starting my first course on string theory. Does anyone have an opinion about which book I should use?

X said...

Counterargument! Assessing a researcher's large body of work is time-consuming and boring. Therefore, using impact factor, letters of recommendation and hiring your friends improves efficiency, leading to a maximization of science by freeing up the hiring committee's time.

Thomas Schaefer said...

The whole impact factor calculation is completely idiotic. Impact factor only includes citations to papers that are less than two years old, and therefore penalizes truly important publications, in particular papers that take a while to be recognized and then accumulate citations for a long time. On the other hand, "me too" papers that are cited for a little while but then forgotten are included.

Bee said...

X: That's not a counterargument, that's the reason for the problem. Saving time is why people use simple measures for scientific success. It's an individual advantage in the short term, a big incentive. Unfortunately, in the long-term it generates a non-optimal trend. If there was no advantage to using metrics whatsoever, there wouldn't be any problem because nobody would use them. Best,

B.

Nemo said...

Hi @Erik,

Lubos Motl commented on some books here:

http://motls.blogspot.com/2006/11/string-theory-textbooks.html

and on some very recent ones here

http://motls.blogspot.com/2012/04/explosion-of-high-brow-stringsugra.html

Have fun ;-)

Cheers

Sabine Hossenfelder said...

Physics Today has a comment on this paper too:

http://blogs.physicstoday.org/thedayside/2013/01/unexpected-consequences-of-journal-rank/?type=PTFAVE