Monday, July 16, 2018

A new tool for arXiv users

Time is money. It’s also short. And so we save time wherever we can, even when we describe our own research. All too often, one word must do: You are a cosmologist, or a particle physicist, or a string theorist. You work on condensed matter, or quantum optics, or plasma physics.

Most departments of physics use such simple classifications. But our scientific interests cannot be so easily classified. All too often, one word is not enough.

Each scientist has their own, unique, research interests. Maybe you work on astrophysics and cosmology and particle physics and quantum gravity. Maybe you work on condensed matter physics and quantum computing and quantitative finance.

Whatever your research interests, now you can show off their full breadth, not in one word, but in one image. On our new website SciMeter, you can create a keyword cloud from your arXiv papers. For example, here is the cloud for Stephen Hawking’s papers:

You can also search for similar authors and for people who have worked on a certain topic, or a set of topics.

As I promised previously, on this website you can also find out your broadness-value (it is listed below the cloud). Please note that the value we quote on the website is standard deviations from the average, so that negative values of broadness are below average and positive values above. Also keep in mind that we measure the broadness relative to the total average, ie for all arXiv categories.
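For readers who want to see what "standard deviations from the average" means in practice, here is a minimal sketch in Python. It is not the SciMeter code. The function name, the raw scores, and the choice of the population standard deviation are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def normalized_broadness(raw_value, all_raw_values):
    """Express raw_value in standard deviations from the average.

    Hypothetical helper: the raw broadness scores and the use of the
    population standard deviation are assumptions, not SciMeter's method.
    """
    mu = mean(all_raw_values)
    sigma = pstdev(all_raw_values)
    if sigma == 0:
        raise ValueError("all raw values are identical; cannot normalize")
    return (raw_value - mu) / sigma


# Invented raw scores standing in for the whole arXiv sample.
sample = [1.0, 2.0, 3.0]

below = normalized_broadness(1.0, sample)    # negative: below average
average = normalized_broadness(2.0, sample)  # zero: exactly average
above = normalized_broadness(3.0, sample)    # positive: above average
```

A value of zero means an author sits exactly at the average of the sample, and the sign only says on which side of that average they fall.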

While this website is mostly aimed at authors in the field of physics, we hope it will also be of use to journalists looking for an expert or for editors looking for reviewers.

The software for this website was developed by Tom Price and Tobias Mistele, who were funded on an FQXi minigrant. It is entirely non-profit and we do not plan on making money with it. This means maintaining and expanding this service (eg to include other data) will only be possible if we can find sponsors.

If you encounter any problems with the website, please do not submit the issue here, but use the form that you find on the help-page.


  1. This is so cool. Love it. Thank you.
    How much effort involved to get this to run on vixra? Would be a good tool for finding new ideas that have not yet been vetted by referees, if one is at least a little daring and curious. And perhaps appropriate for a follow-on fqxi grant.

  2. Hi Sabine,

    The app is really interesting! What does the sign on the broadness number mean? I seem to get negative numbers for authors I'd assume are broad (e.g. Witten, Edward) but sometimes get positive numbers for authors I'd also think are broad (Penrose, Roger).

    Alex Arvanitakis

  3. Very nice Dr. H. Does the software do this all on its own from the raw documents, or are the documents indexed first by their central topics?

  4. "How much effort involved to get this to run on vixra?"

    Probably not too much effort, but way too much courage.

  5. Two words: "excessively clever" "8^>)

    Sites like must be specifically and vigorously excluded lest you lend credibility while burning cpu time. Imagine Ouroboros without the snake. The sump has holes, then holes all the way down.

    viXra:1803.0014 "On the Existence of an Autotautological Physical Theory"
    ... "The World is an autology derived from all tautologies"

  6. Matthew,

    We actually don't use the full text body, but only the title and abstract. I don't know what you mean by "central topic". Best,


    1. Thanks Dr. H. You answered it. The title and abstract would certainly emphasize (be focused on) the "central topic" of the article.

  7. Peter,

    Well, it took us about half a year, maybe this gives you some impression of the effort?

  8. Alex,

    The normalized broadness that the website returns is standard deviations above or below arXiv average. The negative values are below average, positive values above.

    The broadness measures how many connections the topics an author worked (works) on have with the arXiv topics overall. Negative values of broadness are typical for highly specialized topics that have little relevance for other areas of physics. You can look at the broadness values of arXiv categories overall (see this blogpost) to get an impression for which fields have many and which few connections. High energy physics has a very small broadness in general, so do some areas of astrophysics, and nuclear physics. Best,


  9. "How much effort involved to get this to run on vixra? Would be a good tool for finding new ideas that have not yet been vetted by referees,"

    The best way to do that is to submit the paper to a journal.

    "if one is at least a little daring and curious. And perhaps appropriate for a follow-on fqxi grant."

    I doubt it. FQXI is at the edge of the respectable spectrum anyway. That's fine; they are intentionally high-risk/high potential payoff. But messing with viXra would be way too speculative, even for FQXI.

  10. This is a lot of fun, thanks! I hope you will be able to use some of your copious free time to set up a version which allows restriction to broad subfields, eg mathematics?

  11. A suggestion -- Wikipedia says a word cloud (tag cloud) is a "novelty visual representation," and some of your commenters are saying "fun" and "cool." OK, all good, if that's what you want. But if you want "useful," then the devs might consider losing the vertical text, which is just annoying (at least to me). As several examples in the Wikipedia word-cloud article show, you can make a perfectly nice-looking cloud using only horizontal left-to-right text. Much easier to read, especially for the smaller-font entries. But probably not as cool.

  12. Jim,

    Check the box "all words horizontal" just above the cloud.

  13. Sabine, Uncle Al, Phil,...

    Agree it is not easy to find reputable work on vixra. However, if one seeks to untangle the log jam that followed the 1974 bifurcation of gauge and string theories, then it appears to be equally if not more difficult to find reputable work on arxiv, and easy to find a lot of good old boy bullshit.

    Obstacles to new ideas, ideas that actually shift the leverage in fundamental physics, are much more formidable than one might naively imagine, even for an insider arriving at this moment via the Michigan high energy spin physics group, MIT Bose condensation group, and Brookhaven, led the team that developed feedback loops for fast transverse optics in RHIC, the Tevatron, and LHC,...

    Awareness of 'cognitive bias' becomes ever greater as we struggle to understand how theoretical particle physicists have become so ethereally confused. Tho it goes far beyond that, wired into the autonomic nervous system, into the first filter of cognitive repression, into the balance between sympathetic and parasympathetic, between fight or flight and relaxation responses. The mind thinks with the body, uses the body to think. Think about it.

    With the code now extant for arxiv, the easiest next app is surely the dark twin, the goat, the family blacksheep, yes? Just common sense tells us that. For those not lacking either curiosity or courage.

  14. Srsly:

    You might consider licensing this to a garment manufacturer, for individually printed
    T-shirts and other such merch. A small market, perhaps, but maybe not negligible.

  15. rabraha3,

    Can you send me an email re those leaking logs & your interest in contributing? Email is hossi[at]

  16. @peter cameron re viXra: Rigorously derived Euclid cannot be internally falsified, but…a globe! 2-space has three primary geometries. Mathematics is consistent and rigorous but not empirical, not science.

    Product of primes, 2×3×5×7×... = 4π²
    Sum of positive integers, 1+2+3+4+5+... = -1/12
    Product of positive integers, 1×2×3×4... (infinity)! = √(2π)

    Physical theory is math immune to internal correction. External falsification suffers parameters (curve fits). viXra (bad jokes) or arXiv (punchless punchlines), physical theory is the undead, publishing.

    QM and GR suffer Big Bang baryogenesis then Sakharov conditions - specifically excluded by "accepted" theory. One hour of microwave spectrometry of a synthesized molecule can falsify both, outside accepted theory. Unpublishable! 68,000 tons of liquid argon 1.5 km underground for a decade is acceptable.

  17. Uncle Al - thanks for the thoughts. what i got from it:
    1. first paragraph. parabolic and hyperbolic two-spaces falsify Euclid. but why bother? In QM 'curved space' is just another phase shift in flat Minkowski spacetime. Hestenes uses remarks of Poincare (and his own) to point out foolishness of curved space. It was foisted on physics by the math folks circa ~1916, who had a ready made model courtesy 1860s Riemann. They blindsided the physicists. Einstein objected to that interpretation of his work. Can dig out references for you.
    re consistent/rigorous/empirical/science i sorta get the drift, but philosophy knowledge is not so good, have no comment other than don't see how it relates to vixraphobia.
    2. second para puzzles - here you define a topology? and what else? i'm not a numbers person. please explain. and ditto vixraphobia
    3. third para, again weak in philosophy, no comment here other than don't see your purpose in drifting into this sort of thing, how it actually relates to my comments.
    4. send links to your work.

  18. @peter cameron. Pioneer anomaly, OPERA superluminal muon neutrinos, slow light. Contemporary physics lacks predicted observables despite being hugely curve-fitted. Physics, you, assume defective postulates. They cannot be internally identified.

    Big Bang baryogenesis and Sakharov criteria, Noether's theorems. Ignoring baryogenesis offers clean, tight maths that are intrinsically defective for describing the universe. Obtain a high finesse microwave rotational spectrum and blow a resolved chiral molecular beam through a grating to falsify "accepted" theory. 500K theorist-years of physics fear a day of analytical chemistry.

    With Bee's permission, pdf stereograms of two offending molecules built to have no wiggle room and intense outputs, plus a link to derivation, citations, and synthetic paths. Science begins with observations, then relevant postulates, then the maths.

  19. To Sabine Hossenfelder: Forgive me for just jumping into this stream of comments, but I just read your article in The Week regarding a preoccupation with aesthetics in physics. I have seen this idea hinted at elsewhere, but you brought it to the full light of day. Bravo! With that said, I wonder if the problem goes beyond an inattention to the evidence as the driver of theory and experimentation. Perhaps it is a narrow understanding of beauty. My partner of recent years has taught art around the world and I am regularly stunned by her perceptions of what is "beautiful" compared to mine. What at first leaves me cold can become stunningly beautiful as I hear her description or as I try to see it through her eyes...patterns appear, connections are made, new ideas form. So, while I agree that evidence should drive research, abandoning beauty as a useful lens seems a bit draconian. I would recommend that delving into the arts (and particularly the cutting edge of the arts) more deeply might help transform an old and stodgy aesthetic (which is currently blocking progress in physics and likely other disciplines) into a more useful concept of beauty that can serve as a sieve for ideas.

  20. "You might consider licensing this to a garment manufacturer, for individually printed
    T-shirts and other such merch. A small market, perhaps, but maybe not negligible." Hey, I'd buy one with my stuff on it in a heartbeat!

  21. The app looks good!

    I’m sorry I didn’t reply last year to your comment when I offered my help on this project. Back then I had the ill-founded idea that it would be possible to have a side-project whilst having a newborn baby at home.

    Great to see that it worked out anyway :)

