Wednesday, May 08, 2019

Measuring science the right way. Your way.

Today, I am happy to announce the first major update of our website SciMeter.org. This update brings us a big step closer to the original vision: A simple way to calculate what you, personally, think quantifies good research.

Evaluating scientific work, of course, requires in-depth studies of research publications. But in practice, we often need to make quick, quantitative comparisons, especially those of us who serve on committees or who must document the performance of our institutions (or both). These comparisons rely on metrics for scientific impact, that is, measures that quantify the relevance of research, both on the administrative level and on the individual level. Many scientists are rightfully skeptical of such attempts to measure their success, but quantitative evaluation is and will remain necessary.

However, metrics for scientific impact, while necessary, can negatively influence the behavior of researchers. If a high score on a certain metric counts as success, this creates an incentive for researchers to work towards increasing their score, rather than using their own judgement of what constitutes good research. This problem of “perverse incentives” and “gaming measures” is widely known. It has been the subject of countless talks, papers, and manifestos, yet little has been done to solve the problem. With SciMeter we hope to move towards a solution.

A major reason that measures for scientific success can redirect research interests is that few measures are readily available. While the literature on bibliometrics and scientometrics contains hundreds of proposals, most researchers presently draw on only a handful of indicators that are easy to obtain. These are notably the number of citations, the number of publications, the Hirsch-index, and the number of papers in journals with a high impact factor. These measures have come to define what “good research” means just because in practice we have no alternative measures. But such a narrow definition of success streamlines research strategies and agendas. And this brings the risk that scientific exploration becomes inefficient and stalls.

With SciMeter, we want to work against this streamlining. SciMeter allows everyone to create their own measures to capture what they personally consider the best way of quantifying scientific impact. The self-created measures can then be used to evaluate individuals and to quickly sort lists, for example lists of applicants. Since your measures can always be adapted, this counteracts streamlining and makes gaming impossible.
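To make the idea of a self-created measure concrete, here is a minimal sketch in Python. It is not SciMeter's actual implementation; the `Author` record, the `rank` helper, and all numbers are invented for illustration. It only shows that the same list of applicants can sort very differently depending on which measure you choose.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Author:
    # Hypothetical record; the fields are illustrative, not SciMeter's schema.
    name: str
    citations: int
    papers: int


def rank(authors: List[Author], metric: Callable[[Author], float]) -> List[str]:
    """Return author names sorted by a user-chosen metric, best first."""
    return [a.name for a in sorted(authors, key=metric, reverse=True)]


# Made-up applicants, for illustration only.
applicants = [
    Author("A", citations=900, papers=60),
    Author("B", citations=400, papers=12),
    Author("C", citations=650, papers=25),
]

# Measure 1: raw citation count.
by_citations = rank(applicants, lambda a: a.citations)

# Measure 2: citations per paper, which rewards impact over volume.
by_citations_per_paper = rank(applicants, lambda a: a.citations / a.papers)

print(by_citations)
print(by_citations_per_paper)
```

With these made-up numbers the two measures produce exactly opposite orderings, which is the point: whoever does the evaluating decides what counts.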

SciMeter is really not a website, it’s a web-interface. It allows you to make your own analysis of publication data. Right now our database contains only papers from arXiv.org. That’s not because we only care about physicists, but because we have to start somewhere! For this reason, I must caution you that it makes no sense to compare the absolute numbers from SciMeter to the absolute numbers from other services, like InSpire or Google. The analysis you can do with our website should be used only for relative comparison within our interface.

For example, you will almost certainly find that your h-index is lower on SciMeter than on InSpire. Do not panic! This is simply because our citation count is incomplete.
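If you want to see why missing citations pull the number down, the following sketch computes the h-index from a list of per-paper citation counts using the standard definition (the largest h such that at least h papers have at least h citations each). The function and the two lists of counts are made up for illustration and are not taken from SciMeter or InSpire.

```python
from typing import List


def h_index(citations: List[int]) -> int:
    """Largest h such that at least h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for position, count in enumerate(ranked, start=1):
        if count >= position:
            h = position
        else:
            break
    return h


# Invented citation counts for the same eight papers:
# a more complete database versus one that misses some citing papers.
full_counts = [25, 18, 12, 9, 7, 6, 3, 1]
partial_counts = [15, 10, 7, 5, 4, 3, 1, 0]

print(h_index(full_counts))
print(h_index(partial_counts))
```

Since every paper has at most as many citations in the incomplete database as in the complete one, the h-index computed from it can only stay the same or go down.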

Besides comparing authors with each other and sorting lists according to custom-designed metrics, you can also use our default lists. In addition to the list of all authors, we have pre-defined lists of authors by arXiv category (where we count the main category that an author publishes in), and lists of male and female authors (using the same gender-id that we also used for our check of Strumia’s results).

The update also lets you get a ten-year neural-net prediction for the h-index (this was the basis of our recent paper, arXiv version here). You find this feature among the “apps”. We have also made some minor improvements to the algorithm for the keyword clouds. And on the app-page you now also find the search for “similar authors” or authors who have published on any combination of keywords. You may find this app handy if you are looking for people to invite to conferences or, if you are a science writer, for someone to comment.

SciMeter was developed by Tom Price and Tobias Mistele, and has so far been financed by the Foundational Questions Institute. I also want to thank the various test-users who helped us to troubleshoot an early version and contributed significantly to the user-friendliness of the website.

In the video below I briefly show you how the custom metrics work. I hope you enjoy the update!

14 comments:

  1. Congratulations on trying to find better ways to quantify scientific research and success. I'll be watching with interest to see how it works out and evolves.

  2. A statistical evaluation, using AI software, of the data contained in the SciMeter database could help reduce the effort of making in-depth studies of research publications to aid current research, much like the way AI diagnoses disease using information contained in medical case studies and medical textbooks.

  3. Sabine, this is a useful tool. However, I'm a little confused. I went to the website for Scimeter and read the "About Us" page, and there seem to be some differences between what's said there and what you're saying here.

    You say that this is an update that brings you closer to the original vision, but the "vision" seems a bit different on the website. For example, there isn't anything in the "About Us" that mentions personal evaluations of good research. Maybe it's just me and I'm missing something.

    Also, I think it might be helpful if someone created a Wiki page for Scimeter. When I did a wiki search for it, it gave me "scimitar," a rather wicked-looking weapon. But maybe it would be kind of fun to slash bad research papers with a scimitar. :-)

    Replies
    1. Steven,

      We've put this into the "About Metrics" section. I think you are right that we should update the "About Us" text though. Yes, I will try and see whether we can set up a brief Wikipedia entry.

  4. I'll add an update to my previous comment. I took a second look at the Scimeter website and noticed another link in the "About" section labeled "About Metrics." That is where you talk about "measuring science the right way."

    Lesson learned: Next time, look at all the links in the "About" menu to see what a website is all about. :-)

    Replies
    1. Sorry, I only read your second comment after I responded to the first. I actually wrote the "About Metrics" text from scratch and just moved the old introduction text to "About Us", so that's what happened.

  5. Are there plans to incorporate other data beyond the arxiv, or to allow users to use some of the features "by hand" if needed for papers which are not there? Unfortunately I've not been as diligent about posting my research there (and often collaborators are unwilling), but I really like some of the tools you've developed so far.

    Replies
    1. Andrew,

      I have plans, yes, but presently I don't have money to make those plans reality. Please note that much of this data is proprietary.

  6. For hiring purposes it would be useful if you can compare users from a given 'scientific age'. For instance, all people with a first arxiv posting from 2012 (or whatever). Since most metrics improve with age, this would be useful. Maybe this is already possible but I could not figure it out from the site. Anyway, thanks for making this possible.

  7. Fantastic initiative! Congratulations! That's a start at trying to fix things. Of course it is still very early, but someone has to start exploring this space.

  8. Thank you!

    I would like to offer some comments on the problem of authors in large collaborations. I noticed (using the author comparisons with the neural net normalized h-index) that this mainly relies on conference proceedings (or papers outside the collaborations, but conference proceedings are treated on equal footing).
    Now, at least for the large LHC (but also Tevatron) collaborations, conference talks are only ever on published material. That means there is no scientific value in writing a proceeding as a means to communicate findings (since these are usually reported in more depth in the respective publications). There is merit for early career researchers to write 1-2 proceedings as practice, but otherwise you would hope that they write papers inside the collaborations. The fact that a conference talk was given (these are distributed on a merit basis) probably counts for more than the proceedings in the general scheme of things.

    I fully understand and agree that there is no way around this, as there is no way to access internal collaboration information (though internally, at least for some collaborations, there is a record of who did what on the internal paper author list, and usually one can understand from reading this info approximately how much a given person contributed).

    As said, I fully appreciate the problem, I just wanted to point it out.

  9. I have some real concerns about some of the metrics used in the system - in particular the 'Sum of impact factors' which is defined as "The sum of the journal impact factors for each paper (for which a journal impact factor could be found)".

    Impact factors are a journal-level metric and the dangers of applying them at article or author level are well documented. They also can't be used to compare journals from different disciplines (given the variation in average citation rates), so simply adding the impact factors of the journals each paper was published in seems about as misleading and inappropriate a metric as you could get, especially when used as the basis of hiring or promotion decisions. Perhaps some in-platform context and guidance about how to build a 'responsible' metric, or health warnings about the drawbacks of metrics such as the Impact Factor and h-Index, would be useful?

    Replies
    1. No one forces you to use measures you don't like.

