Sunday, August 18, 2013

Researchers and coffee consumption

You might have seen this collection of 40 world maps in your news feed recently. It's interesting and worth a look. As I scrolled down the list, it looked to me as if the number of researchers (per million inhabitants) is correlated with coffee consumption (in kg per capita). So I pulled down the data and plotted it in Excel, and here we go:

Coffee consumption vs number of researchers. The red dot is Germany.

I passionately hate Excel and have no idea how to convince it to give me a p-value, but I've seen worse correlations published. More coffee consumption linked to more research!

If you want to play with the data, you can download the Excel sheet here. I've left Singapore out of the table because I wasn't sure whether the entry "0" meant there's no data or that nobody in Singapore drinks coffee. I've made a second plot in which I left out the 15 main coffee-exporting countries (according to Wikipedia), but visually it doesn't make much of a difference, so I'm not showing you the graph. (It's in the Excel sheet.) According to chartsbin.com, the data on researchers per million inhabitants is from the UNESCO Institute for Statistics, and the data on coffee consumption is from the World Resources Institute.

Don't take this too seriously. I'd guess that you'd find a similar correlation for many consumer goods. It has some amusement value though :o)

16 comments:

uair01 said...

And what would you say about this one? It even sounds plausible:

The Effect of Sexual Activity on Wages

The purpose of this study is to estimate whether sexual activity is associated with wages, and also to estimate potential interactions between individuals’ characteristics, wages and sexual activity. The central hypothesis behind this research is that sexual activity, like health indicators and mental well-being, may be thought of as part of an individual’s set of productive traits that affect wages.

http://marginalrevolution.com/marginalrevolution/2013/08/the-effect-of-sexual-activity-on-wages.html?utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+marginalrevolution%2Ffeed+%28Marginal+Revolution%29

Bart Coppens said...

Another similar and amusing food-related correlation can apparently be found between chocolate consumption and Nobel prizes.

Phil Warnell said...

Hi Bee,


What this has me wondering now is which of the two one should consider the cause and which the result; or which is fundamental and which emergent :-)


“I have measured out my life with coffee spoons.”

-T.S. Eliot, “The Love Song of J. Alfred Prufrock”


Regards,

Phil

Arun said...

There should be a "Format Trendline" and the options in that allow you to display the equation. From your spreadsheet I get

y = 0.0011x + 0.7341
R^2=0.60786
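
For anyone who would rather not fight Excel: the trendline and its R^2 are just an ordinary least-squares fit, which can be sketched in a few lines of Python. This is only a sketch. The numbers below are made-up placeholders, not the values from the spreadsheet, `fit_line` is a hypothetical helper name, and I'm assuming x is researchers per million and y is coffee in kg per capita.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept, plus R^2."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)  # Pearson correlation coefficient
    return slope, intercept, r * r  # for a straight-line fit, R^2 = r^2

# Placeholder data, NOT taken from the spreadsheet.
researchers = [100.0, 500.0, 1500.0, 3000.0, 5000.0]  # per million inhabitants
coffee = [0.5, 1.2, 3.0, 4.1, 7.5]                    # kg per capita

slope, intercept, r_squared = fit_line(researchers, coffee)
print(f"y = {slope:.4f}x + {intercept:.4f}")
print(f"R^2 = {r_squared:.5f}")
```

For a single-variable linear fit like this one, the square root of R^2 is the magnitude of the correlation coefficient r.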

Uncle Al said...

Paul Erdős' colleague Alfréd Rényi said, "a mathematician is a machine for turning coffee into theorems" (Erdős also did speed). High autists/Aspergers are focussed by stimulants. Some side effects may obtain.

The US Department of Education identifies the Severely and Profoundly Gifted, to destroy them as social justice for privileged minorities. Thus Warner Bros. Los Angeles County Honorship recipient Kashawn Campbell. Honorship? "He would like to major in communications and pursue a career in broadcasting." Perfect.

Bob said...

The product moment correlation coefficient for this data is 0.608. For 116 data points, the critical value for five sigma is 0.445.

You have about 8 sigma here: that's p~10^-15

Sabine Hossenfelder said...

Bob: I have a hard time believing your numbers.

Sabine Hossenfelder said...

Though, if I think about it, the figure is visually very misleading because one doesn't see that there's like 100 points or so crammed into the lower left corner, basically on top of each other.

Bob said...

Yes, the numbers are quite striking. I've tried to do it a little more reliably, but it isn't very easy. (Apparently, if you have Mathematica 9, you can use CorrelationTest to get a p-value. It doesn't appear to be in 8, which I have.)

The PMCC is generated by Excel, as Arun said. Actually, r=0.780, which is even higher. It's the squared value that is 0.608.

If you put 0.445 and 116 in this calculator, you'll see the five sigma (compare with five sigma here).

r=0.365 gives 4 sigma,
r=0.404 gives 4.5 sigma,
r=0.445 gives 5 sigma,
r=0.482 gives 5.5 sigma.

The calculator doesn't go much further than that, but the asymptotic behaviour is pretty clear. If you want r=0.780, you'll have about 9.25 sigma. Which is p=2 x 10^-20.
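
If you'd rather not rely on an online calculator, a rough version of the conversion from r to "sigma" can be sketched with the Fisher z-transformation. This is an approximation I'm swapping in, not necessarily what the calculator does: under the null hypothesis of zero correlation, atanh(r) * sqrt(n - 3) is approximately standard normal. The function name `fisher_z_significance` is hypothetical, and the two-sided p-value is my assumption about the convention.

```python
import math

def fisher_z_significance(r, n):
    """Approximate significance of a Pearson r from n points (null: no correlation).

    Uses the Fisher z-transformation: atanh(r) * sqrt(n - 3) is roughly
    standard normal under the null. Returns (sigma, two-sided p-value).
    """
    sigma = math.atanh(r) * math.sqrt(n - 3)
    p_two_sided = math.erfc(abs(sigma) / math.sqrt(2))
    return sigma, p_two_sided

# The r values quoted above, all for 116 data points.
for r in (0.365, 0.404, 0.445, 0.482, 0.780):
    sigma, p = fisher_z_significance(r, 116)
    print(f"r = {r:.3f}: about {sigma:.1f} sigma, p ~ {p:.1e}")
```

Because this is a normal approximation, the result for r=0.780 need not agree with an extrapolation from the calculator, and the tail probability that far out should be read as an order of magnitude at best.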

Bob said...

Alternatively, there's a formula here, which integrates to give p=3 x 10^-25.

It's getting smaller every time I look.

I think they're correlated.

Sabine Hossenfelder said...

Well, I guess if the null hypothesis is that each country, irrespective of its fraction of researchers, consumes coffee somewhere in the observed range, then the probability that just by chance the 80 countries with a small fraction of researchers consume basically no coffee is indeed tiny. Of course, as I said in my post, that doesn't make much sense as a null hypothesis. It would make more sense to use correlation with GDP or maybe average income/average cost of living or some other measure of household liquidity or wealth.

Bob said...

Yes, if what you want is a causal hypothesis, it gets a lot trickier.

You could test it to some extent by claiming that the GDP-coffee correlation and the GDP-researcher correlation are both stronger than the coffee-researcher correlation. If that claim held up, it would pretty much kill any suggestion that there's a direct causal coffee-researcher link, if anyone was silly enough to make one.
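
One standard way to make that kind of check concrete is the first-order partial correlation: the coffee-researcher correlation that remains once the linear effect of GDP has been removed from both. This is a swapped-in technique, not exactly the comparison described above, and the sketch below uses made-up placeholder numbers rather than real GDP, coffee, or researcher data. The helper names are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)

def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation of x and y after controlling linearly for z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Placeholder data, NOT real statistics: both quantities loosely track "gdp".
gdp = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
coffee = [1.1, 1.9, 3.2, 3.8, 5.1, 6.0]
researchers = [0.9, 2.2, 2.8, 4.1, 4.9, 6.2]

r_cr = pearson_r(coffee, researchers)
r_cg = pearson_r(coffee, gdp)
r_rg = pearson_r(researchers, gdp)
print(f"coffee-researchers: {r_cr:.3f}")
print(f"coffee-GDP: {r_cg:.3f}, researchers-GDP: {r_rg:.3f}")
print(f"coffee-researchers given GDP: {partial_correlation(r_cr, r_cg, r_rg):.3f}")
```

A partial correlation near zero would suggest that GDP accounts for the apparent coffee-researcher link, though it still says nothing about which way any causation runs.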

Confirming a causal link - unless you have plenty of evidence of what happens when coffee supplies dramatically change while the background of other possible factors is very stable - is virtually impossible.

This frees up the communities of political, social and economic scientists to harbour lots of strongly-held and mutually-contradictory beliefs about the most fundamental causal relations, such as whether tightening all government spending tends to increase or decrease a deficit. They're encouraged to appear very sure of themselves if they want to be taken seriously, which amounts to a very effective way of selecting for the deluded.

I'm sure some of that happens in physics too, but I'd expect it to be a lot less. That's my causal hypothesis, anyway! If someone gave me some coffee, I'd follow it up.

johnduffieldblog said...

Oh for God's sake, this is really silly.

Coffee has got nothing to do with productive scientific research. Zip, zilch, zero.

It's nicotine that does it. We lived in rags and huts until we discovered tobacco. Shortly thereafter we had cars and planes and skyscrapers, and e-lec-tri-city.

:)

Anonymous Snowboarder said...

Bee - I'm disappointed.. why no log/log plot?

Sabine Hossenfelder said...

Because I hate Excel! Just typing the word causes me mental pain. I think it's because I associate it with university administration, something I normally try to avoid contact with at all costs.