Sabine Hossenfelder: Backreaction: Learning to deal with information

Thursday, May 27, 2010

Learning to deal with information

In the 21st century, information is cheap. Or is it? I have written several times on this blog that it is a naive illusion to think of the internet as a democratic provider of information. Moreover, the simple provision of information is not equivalent to people being well informed.

The availability of information on the internet is not democratic but, if anything, anarcho-capitalist. If you have the money to pay people who know something about search engine optimization, and others to spam links to your site wherever they won't be immediately deleted, you can pimp up your website's ranking dramatically. Even Google's PageRank algorithm is clearly not democratic: it gives more weight to a link from a site that itself has more links. That's what makes it so powerful and so useful. Sure, we all profit from this clever ordering of information. I'm certainly not complaining about it. It's just not democratic and shouldn't be sold as democratic, since not everybody's voice has the same weight. Google itself is smart enough not to call the algorithm "democratic," but writes that "PageRank relies on the uniquely democratic nature of the web." Um, which democratic nature are we talking about again? But maybe more importantly, Google's PageRank also doesn't tell you anything about the quality of the information you obtain. That the voices of the wealthy have more impact is hardly surprising, and merely a reflection of what has been going on in the media and news press for a long time.
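To make the point that "a link from a well-linked site counts more" concrete, here is a minimal Python sketch of a PageRank-style iteration. It is a simplified textbook version, not Google's actual implementation; the damping factor of 0.85, the fixed iteration count, and all page names are assumptions made purely for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by repeated redistribution of rank.

    links: dict mapping each page to the list of pages it links to.
    Returns a dict mapping every page to a score; scores sum to 1.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start with equal weight for everyone

    for _ in range(iterations):
        # Every page receives a small baseline share.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its current rank on to the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page without outgoing links spreads its rank evenly.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


# Hypothetical toy web: "a" and "b" each receive exactly one link,
# but "a" is linked from "hub", which is itself linked by three pages.
links = {
    "x": ["hub"],
    "y": ["hub"],
    "z": ["hub"],
    "hub": ["a"],
    "nobody": ["b"],
}
ranks = pagerank(links)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{page:7s} {score:.3f}")
```

In this toy web, "a" and "b" have the same number of incoming links, yet "a" ends up with the higher score because its single link comes from a page that is itself well linked. One link, one vote it is not.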

Now one could of course argue that it's up to me to just go through all the hits that my search brought up and find the best piece of information. But as a matter of fact, most people don't do that. I usually don't do it either. And that's not even irrational, because scanning through all the hits that one gets on a query is very time-intensive and the result rarely justifies the effort. Thus, most people will skim maybe the first 20 hits, if at all, and conclude that they've gotten a fair cross-section of what there is to know about the topic. That's the part of the information that is "cheap." Everything else, for example checking sources, becomes increasingly costly in terms of time and effort. And since most websites don't list their sources, there are few shortcuts to that. What is left is that whoever dominates the "cheap" information does, for all practical purposes, dominate the information market. The only cure for that is information literacy.

The other day, I read an interesting article by Mark Moran. Moran is CEO of a Web publisher that offers free content and tools that teach students how to use the Web effectively. He writes:

"[A]s the founder of a company whose mission is to teach the effective use of the Internet, I have pored through dozens of studies, and recently oversaw one myself, that all came to the same conclusion: Students do not know how to find or evaluate the information they need on the Internet.

In a recent study of fifth grade students in the Netherlands, most never questioned the credibility of a Web site, even though they had just completed a course on information literacy. When my company asked 300 school students how they searched, nearly half answered: "I type a question." When we asked how students knew if a site was credible, the most common answers were "if it sounds good" or "if it has the information I need." Equally dismal was their widespread failure to check a source’s date, author or citations."

I find this seriously scary! As I have expressed in my earlier post Cast Away, the passing on of knowledge to the next generation is one of the most essential ingredients to continuing progress. How are people supposed to make informed decisions if they can't tell what the relevant information is to begin with? Where does that leave our political systems? But then I read the following:

"Every day, we are inundated with vast amounts of information. A 24-hour news cycle and thousands of global television and radio networks, coupled with an immense array of online resources, have challenged our long-held perceptions of information management. Rather than merely possessing data, we must also learn the skills necessary to acquire, collate, and evaluate information for any situation. This new type of literacy also requires competency with communication technologies, including computers and mobile devices that can help in our day-to-day decisionmaking. [...]

Though we may know how to find the information we need, we must also know how to evaluate it. Over the past decade, we have seen a crisis of authenticity emerge. We now live in a world where anyone can publish an opinion or perspective, whether true or not, and have that opinion amplified within the information marketplace."

Wise words, eh? Guess where that's from? Guess, don't Google! It's a press release from the White House. No, really. It's an announcement for the "National Information Literacy Awareness Month," which took place in October last year and somehow passed me by. While recognizing a problem isn't the same as solving it, it is certainly a good first step. Let's hope that other nations will follow that example; there's clearly hope. Yes, we can do it! Indeed, there is more hopeful news today: The Pew Research Center's Project for Excellence in Journalism some days ago published new data comparing the news coverage on blogs to that in the traditional press. Here's an interesting number: only 2% of news in the traditional press is about science and technology. But on the blogs, it's 18%.

23 comments:

Arun, 12:10 PM, May 27, 2010
Dear Bee,

Someone will next invent an information rating service. E.g., you do your google search and pipe its results into your favorite information rating service, and it provides a rank possibly different from google, based on the expertise built into the rating service.

Then of course, there will be a proliferation of such services, especially for things of common interest, e.g., health and diet, and so there will be rating services for information rating services.

Ultimately someone will build the recursive rating service and suddenly the Web will become sentient :)

Best,
-Arun
Sabine Hossenfelder, 12:23 PM, May 27, 2010
Dear Arun,

I have had very similar thoughts, but I'm still waiting for it to happen. The question is of course, where does the expertise come from? Best,

B.
Cristi Stoica, 2:42 PM, May 27, 2010
Dear Bee,

There was a time when newbies became famous and rich overnight with amateurish websites, competing with giants with long traditions. An era of inflation, when money was created out of nothing by SEO techniques, before the dot-com bubble, and before Google started dealing the cards.

Google's PageRank not only helped to create a more relevant ranking of web pages. It also transferred the real-world "mass" of corporations to their counterparts in the virtual world. This may have helped to stop the dot-com bubble, and it was very convenient for the old-tradition corporations, which were created by hard work and forced to compete with cheaply improvised websites. (That's my "conspiracy theory": if Google did not exist, the corporations would have created it to regain their "mass" and sweep away the competition with fewer financial resources. Lucky for them, Google was created, becoming the last successful garage-based business.)

The idea behind PageRank is very similar to what we do when we select what to read or think about, or what to believe. The sin of trusting too much Google's results is similar to the sin of trusting too much the authorities of the day, the citations, recommendations, keywords, credentials. There is simply no time to check and understand everything, even for scientists. Various research trends are so disconnected that they often know only half-truths about each other ("here be dragons").

Best regards,
Cristi
PlatoHagel, 2:54 PM, May 27, 2010
Page ranking source within Backreaction based on word "information."

I was looking for something specific, in terms of processing. Guess which one? :)

#
Backreaction: The Black Hole Information Loss Paradox
24 Jun 2008 ... OK, you might say, what's the point of this if no information can escape ... When the black hole evaporates, if the information has not been ...
backreaction.blogspot.com/2008/.../black-hole-information-loss-paradox.html
#
Backreaction: The Information Triangle
28 Oct 2007 ... "Information" assumes an empirical source, a neutral conduit, and a capable destination. The mob is long past understanding anything of its ...
backreaction.blogspot.com/2007/10/information-triangle.html

#
Backreaction: Black Holes and Information Loss
7 Feb 2010 ... To understand the black hole information loss problem you need one ... Black hole evaporation causes a loss of information because the ...
backreaction.blogspot.com/2010/02/black-holes-and-information-loss.html

#
Backreaction: Learning to deal with information
27 May 2010 ... In the 21st century, information is cheap. Or is it? I have written several times on this blog that it is a naive illusion to think of the ...
backreaction.blogspot.com/.../learning-to-deal-with-information.html - 4 hours ago

#
Backreaction: Information Overload
14 Jun 2008 ... Physics Blog, written by Sabine Hossenfelder and Stefan Scherer.
backreaction.blogspot.com/2008/06/information-overload.html

#
Backreaction: Top Ten
4 Jul 2006 ... 2) Do black holes destroy information? ..... If my little sister can destroy information (and believe me she can) I'm guessing a black hole ...
backreaction.blogspot.com/2006/07/top-ten.html

#
Backreaction: Comet 17/P Holmes
31 Oct 2007 ... Thanks for this information. I hope to see it if the sky is clear tonight. At 11:38 AM, November 01, 2007, Anonymous Navneeth said. ...
backreaction.blogspot.com/2007/10/comet-17p-holmes.html

#
Backreaction: First Light for the Gran Telescopio Canarias
14 Jul 2007 ... And here you can find more information on it too, including what Queen ... You can find information in english at Grantecan.es the public ...
backreaction.blogspot.com/2007/07/first-light-for-gran-telescopio.html

#
Backreaction
23 May 2010 ... Otherwise you lose information that is possibly important about the range of applicability (information you at first possibly didn't think ...
backreaction.blogspot.com/

#
Backreaction: Indirect Detection of Gravitational Radiation
19 Dec 2007 ... Thanks for the information, Stefan. :). At 2:19 PM, December 19, 2007, Blogger Neil' said... I was browsing Penrose's latest big opus, ...
backreaction.blogspot.com/2007/12/indirect-detection-of-gravitational.html

Best,
Uncle Al, 2:56 PM, May 27, 2010
A skilled and motivated polity capable of informed critical thought is monstrous. Hegemony is rule unfettered by facts other than those born as required. Would you end the Space Scuttle, social advocacy, vest pocket wars, Homeland Severity, drug and vice enforcement, juryless tax courts... entire cultures founded upon fear, privation, baksheesh, and stukachi?

The annual California Academic Performance Index tells us that 60% of Los Angeles high school students (40% dropouts) testably snug borderline mental retardation. The 60% and the 40% are the Chosen Ones.

"Orthodoxy means not thinking — not needing to think. Orthodoxy is unconsciousness." 1984
PlatoHagel, 3:04 PM, May 27, 2010
Okay, you might have thought Information Triangle? I did. So let's go with that.

We understand that one has set the parameters within their own site as to the allocation of the term? They have a special affinity with the term?

In correlation, I saw a similar image, but from Penrose.

Information processing.

1.
Backreaction: The Information Triangle
28 Oct 2007 ... that's a very informative & informed information triangle, and thank you that you tell us where we are on this map ;-)... ehm, but I do not ...
backreaction.blogspot.com/2007/10/information-triangle.html

2.
Backreaction: The Unitary Triangle
24 Dec 2007 ... My husband and I, we both agreed the Unitary Triangle above is the .... The Information Triangle. The Illusion of Knowledge · Cast Away ...
backreaction.blogspot.com/2007/12/unitary-triangle.html

3.
Backreaction: Learning to deal with information
27 May 2010 ... In the 21st century, information is cheap. Or is it? .... The Information Triangle. The Illusion of Knowledge · Cast Away · The Spirits that ...
backreaction.blogspot.com/.../learning-to-deal-with-information.html - 4 hours ago

4.
Backreaction: Science, Writers, and the Public - A bizarre love ...
16 Oct 2009 ... Your information exists in part as a totality of the experiment? ...... The Information Triangle. The Illusion of Knowledge · Cast Away ...
backreaction.blogspot.com/.../science-writers-and-public-bizarre-love.html

5.
Backreaction: Indirect Detection of Gravitational Radiation
19 Dec 2007 ... Thanks for the information, Stefan. :). At 2:19 PM, December 19, 2007, .... The Information Triangle. The Illusion of Knowledge · Cast Away ...
backreaction.blogspot.com/2007/12/indirect-detection-of-gravitational.html

6.
Backreaction: Elegant proofs
11 Apr 2008 ... It comes out that this triangle, because of the rules chosen for .... The Information Triangle. The Illusion of Knowledge · Cast Away ...
backreaction.blogspot.com/2008/04/elegant-proofs.html

7.
Backreaction: Comet 17/P Holmes
31 Oct 2007 ... Thanks for this information. I hope to see it if the sky is clear tonight. .... This and That (14); The Information Triangle (18) ...
backreaction.blogspot.com/2007/10/comet-17p-holmes.html

8.
Backreaction: This and That
That is to say that I see information as the ordering or things, as opposed to the .... The Information Triangle. The Illusion of Knowledge · Cast Away ...
backreaction.blogspot.com/2009/01/this-and-that_20.html

9.
Backreaction: Filtering Gravity
23 Mar 2007 ... I understand that if we compress information into a zero 0 .... The Information Triangle. The Illusion of Knowledge · Cast Away ...
backreaction.blogspot.com/2007/03/filtering-gravity.html

10.
Backreaction: Science in the 21st Century
1 Feb 2008 ... Which refers to a graphic I had in an earlier post, the Information Triangle that I found a handy way to visualize these interrelations. ...
backreaction.blogspot.com/2008/02/science-in-21st-century.html
PlatoHagel, 3:43 PM, May 27, 2010
Nice image

Penrose's image is nice too.

This process is based on an inductive/deductive assessment within the context of the information. It becomes self-evident. Right or wrong, it sets the pace for the next generation.

What is Scientific Prediction

Best,
Sabine Hossenfelder, 4:09 PM, May 27, 2010
Hi Cristi,

"The sin of trusting too much Google's results is similar to the sin of trusting too much the authorities of the day, the citations, recommendations, keywords, credentials. "

That's true and indeed very similar to what I was saying. However, the internet vastly amplifies any effects because of its much better connectivity and faster pulse. Basically, mistakes spread faster and they spread wider than ever before, a consequence of the underlying network. That's why the impact is much higher.

Here's a funny example about the spread of mistakes that I came across coincidentally. An alleged quotation by Einstein:

Er ist eine Skala der Proportionen, die das Schlechte schwierig und das Gute leicht macht.

You find this on Wikipedia and it has nicely propagated. Unfortunately, the German sentence doesn't make any sense. (The first word should be "Es," not "Er.") Ironically, the correct version doesn't bring up any hits. (Except one on the Wikipedia discussion page, which, as you can easily guess, was my comment.)

Okay, true, that's not exactly an example that makes a big difference either way, but it nicely illustrates what missing source checking can result in if errors propagate quickly and widely. Best,

B.
Arun, 6:13 PM, May 27, 2010
Hi Bee,

One could come up with a page rank algorithm different from Google, where items linked to by good sources are rated higher. One would need some semantic analysis to know whether the good source is confirming or criticizing the linked source.

One would set up one's good sources first, and since it takes human effort and expertise, it would almost certainly have to be limited to a specific domain. For example, let's say Number Theory. A small team would have to rate some set of internet sources on Number Theory and then let software propagate the rankings. Periodic review and re-rating will likely be necessary.

Let's call it a "propagation of authority" algorithm. Should be in the realm of the possible.

-Arun
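A rough Python sketch of what such a "propagation of authority" scheme might look like: a few hand-rated sources act as seeds, and their authority flows only along links that endorse the target. This is one possible reading of the idea, close in spirit to a seeded (personalized) PageRank, and not an existing service. The stance labels "endorses" and "criticizes" stand in for the semantic analysis, and every page name and score below is hypothetical.

```python
def propagate_authority(links, seed_scores, damping=0.85, iterations=50):
    """Spread hand-assigned authority along endorsing links.

    links: list of (source, target, stance) tuples, where stance is
           "endorses" or "criticizes" (assumed to come from human or
           semantic analysis; only endorsements pass authority on).
    seed_scores: dict mapping hand-rated pages to non-negative ratings.
    Returns a dict mapping every page to an authority score. Scores are
    relative and are not normalized to sum to 1.
    """
    total = sum(seed_scores.values())
    if total <= 0:
        raise ValueError("at least one seed needs a positive rating")

    pages = set(seed_scores)
    endorsements = {}
    for source, target, stance in links:
        pages.update((source, target))
        if stance == "endorses":
            endorsements.setdefault(source, []).append(target)

    # Unrated pages start with no authority of their own.
    seed = {p: seed_scores.get(p, 0.0) / total for p in pages}
    authority = dict(seed)

    for _ in range(iterations):
        # Seeds keep being refreshed from the expert ratings ...
        new_authority = {p: (1.0 - damping) * seed[p] for p in pages}
        # ... and each page hands its authority to the pages it endorses.
        for source, targets in endorsements.items():
            share = damping * authority[source] / len(targets)
            for target in targets:
                new_authority[target] += share
        authority = new_authority
    return authority


# Hypothetical example: one rated source, two unrated link spammers.
links = [
    ("expert_notes", "good_survey", "endorses"),
    ("expert_notes", "crank_page", "criticizes"),
    ("spam_1", "crank_page", "endorses"),
    ("spam_2", "crank_page", "endorses"),
]
scores = propagate_authority(links, seed_scores={"expert_notes": 1.0})
for page, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{page:13s} {score:.4f}")
```

Because unrated pages have nothing to pass on, the two spam links do nothing for "crank_page", and the critical link from the rated source does not help it either. The periodic review and re-rating would correspond to updating `seed_scores` and rerunning the propagation.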
Phil Warnell, 7:37 AM, May 28, 2010
Hi Bee,

You are indeed warranted to be concerned about how information is gathered and accessed. The real question, however, is whether this is primarily a process best dealt with by simply applying metrics, which attempt to quantify things, or rather a process that demands first that whoever is doing the gathering and assessment cares about understanding things correctly. The latter is more of a qualitative process, one that leaves the metrics themselves needing to be assessed in terms of their relevance.

For instance, a fact so often quoted today is world population, yet what is the relevance of such information without whatever serves to qualify it? Another number often bandied about these days is how many trillions of dollars the global economy has shrunk by, and yet how often do you see reported what the total size of the economy was to begin with, or are reminded how much of this reflects what things are perceived to be worth rather than anything intrinsic?

So all I’m actually suggesting is that what lies at the heart of all this is how we get people to pay more attention to quality as opposed to quantity, which seems to be the primary thing they are able to recognize. This might begin by asking people whether it is more important to know many things or rather that what they think they know is actually understood. In my everyday experience I assess this first by recognizing whether people are rote learners or conceptual ones, and by that measure I’m sorry to report that the latter are in the extreme minority. I would contend the primary reason is that so few people recognize why caring about this is important. That means first being able to recognize that truth is not something to be had for the taking, but rather something we must struggle to understand.

Best,

Phil
Sabine Hossenfelder, 8:09 AM, May 28, 2010
Dear Arun,

I was indeed thinking of something very similar, though more simplistic. I thought that it would easily be possible to tag a link with a (hidden) attribute that tells why you're linking to this site. There might be different criteria, e.g. accuracy, or maybe you're linking there exactly because it's inaccurate! Or because it's funny, or you like the design, or whatever. Now the question is which of such attributes propagate the way PageRank presupposes. I mean, PageRank doesn't just calculate the popularity of a site by how many people think it's popular, but basically by how many popular people think it's popular. So, there might be criteria that one can select this way and others not, depending on whether somebody who is thought to have X is more likely to be good at detecting X. That might indeed be the case for scientific accuracy. I would claim that people who are themselves more scientifically accurate are more easily able to recognize another scientifically accurate site, so this would work. I'm not sure it would work with humor or good design though. Best,

B.
Tim vB, 8:36 AM, May 28, 2010
Dear Bee,

"I thought that it would easily be possible to tag a link with a (hidden) attribute that tells why you're linking to this site."

That's an idea that uses concepts of the "semantic web"; I suppose you have heard the buzzword before?

With regard to information processing:

1. I know many people who believe everything said in commercials. No, really. If a commercial says that a new shampoo contains active bacteria cultures and vitamin F and that this is good for your health, they will believe it (all have Abitur with grades between 1 and 2).

2. There are many errors in educational books that are passed on to the next generation with great passion and patience by the teachers.

3. Watch Fox. Read some of the blogs associated with its online articles.

All of this could make you despair, but I am filled with confidence: See, they do not understand and God takes care of them nevertheless :-)

Does the internet up- or downgrade this problem? I do not know...
Kay zum Felde, 9:53 AM, May 28, 2010
Hi Bee,

I too found it scary that people type questions into Google to find something. What about scientific work? You need to be an expert to evaluate this kind of work. And you can find any kind of scientific work (good and bad) on the web.

I once read that chocolate can protect people against cancer and that a study with people had been done on it. Of course there was no description of the experimental setup. Only the 'news'.

Best, Kay
Steven Colyer, 10:06 AM, May 28, 2010
I forget which of Hawking's books it was, Time or Nutshell, but I think he mentioned that by either 2008 or 2015, at the rate of publication at the time he wrote it, the number of published scientific papers in physics would reach and then begin to exceed one paper per minute.

And who has the time to read all that, even the abstracts?

Blogging was not an issue when Hawking wrote. Now it is.

TimvB, I would say both. The Internet has increased happiness and decreased it in almost equal measure. It has a capacity for great good and great evil, and the good and evil have both exploited it.

But at the end of the day, it still has capacity. Indeed, capacity is its very definition, yes?
Sabine Hossenfelder, 10:26 AM, May 28, 2010
Hi Tim vB,

Yes, I've heard of the semantic web. What I'm saying is to couple the tagging to an intelligent order-mechanism, similar to what PageRank does.

"All of this could make you despair, but I am filled with confidence: See, they do not understand and God takes care of them nevertheless :-)

Does the internet up- or downgrade this problem? I do not know..."

Well, I'm an atheist; I believe that we ourselves are responsible for taking care of our lives. And I do think that if there's one threat to the well-being of mankind, then it's stupidity. The internet itself is a technology that can either work for or against us, depending on how smartly we use it. Just sitting around and expecting it to work for us without using our brains is not a very promising procedure.

There is good reason to believe that the internet amplifies many problems, especially those connected to misinformation. For that, it is not necessary that the wrong information is actually the only information. It is already sufficient if people have the impression that a wrong argument has the same weight as the correct one. Such a wrong interpretation can easily be caused by search engine results. It is a well-known fact in psychology that people regard a piece of information as more likely to be true the more people (they think) believe it. (In fact, they even think it more likely if the same person repeats it.) The internet vastly amplifies any such effect, and it also streamlines opinions. Where previously people might have exchanged opinions more or less locally, now we're all strongly coupled.

On the other hand, the internet is of course also an opportunity to overcome such problems and improve matters. But for that we have to think carefully about the relevance of the ordering, structure, and communication of information and its impact on our civilizations. Best,

B.
Sabine Hossenfelder, 10:43 AM, May 28, 2010
Hi Kay,

Well, I too type questions into Google :-) In some cases it's actually useful. I do this for example when I'm uncertain about an English idiom or a grammar rule. (I don't care very much what the "officially" correct version is, I just want to know if people will understand it.) Other things where Google information is quite useful are travel tips, or generally any sort of information that you can take without a weighting. For example, if you're ill and want to know what your symptoms could mean, you will usually find a pretty accurate listing. The problem is that the most prominent results will not be the most likely ones, but the most dramatic ones. Whatever your symptom, it's probably cancer, you have 5 months left to live. Unless, that is, you crank up your daily chocolate intake ;-)

What is scary, as you say, is if you then look at the details. You will find plenty of handmade advice, that has been copied over and over and over again, and no source to be found anywhere. The worst are typically self-help forums. I guess that's because people who have been in a specific situation are likely to believe they are experts, so why would they have to look at results of controlled trials?

I am not entirely sure why this is, but I believe a big part of the reason is that the scientific articles are for most people inaccessible, either because they can't access the journal or because they can't understand the professional garble. I can well relate to that. So what most people rely on is some news article summarizing a research finding, or a blog summarizing the news article, and we all know how distorting that can be. It's somewhat like the children's game Chinese whispers (stille Post), always interesting what comes out in the end, but most of the time void of information. Best,

B.
Sabine Hossenfelder, 12:46 PM, May 28, 2010
Hi Steven,

Yes, that's exactly the question: who has the time to read all of that? Nobody. That's why unstructured information is useless information. The more information there is, the more important it becomes that one can find what one needs to know, or possibly find that it's just not known. Best,

B.
Sabine Hossenfelder, 3:04 PM, May 28, 2010
Hi Phil,

I don't think I indicated that I believe information literacy (or lack thereof) is best dealt with by metrics. That's certainly not the case. However, a dumb use of metrics certainly doesn't improve the situation, so it's an obvious first step to ask whether there isn't a smarter way to sort information, one that is more beneficial. Best,

B.
Phil Warnell, 7:56 AM, May 29, 2010
Hi Bee,

I didn’t mean that you were suggesting a metrics-based approach, merely to agree with you that this is the way it has been approached up until now. What Google has done is transplant a model from academia that has a quality control mechanism built in, which has no direct correlation with the world in general. The bottom line is that there is not much other than popularity and consensus used to measure quality, which even in academia has become increasingly unreliable despite it having a quality mandate at its core. That’s why it unavoidably comes down to the individuals, on both the contribution and the consumption side.

Oh yes, I meant to comment on your commending Obama’s recognition of the problem and his addressing it: at first I was impressed. However, I still have difficulty with the White House’s lack of transparency with respect to its own practices. They have a blog on their site that doesn’t include a comment section, which I find flies in the face of being so enlightened. I don’t expect that he should personally monitor it and respond, yet I think it important enough that someone does and that reasonable resources are allocated to it. After all, as he himself would insist, it’s all about the information and the assurance of its quality. I honestly see nothing wrong with promoting such a dialogue and suggest that in the end it would solve many more problems than it presents.

Best,

Phil.
Sabine Hossenfelder, 11:17 AM, May 29, 2010
Hi Phil,

If I was the White House, my blog wouldn't have comments either. Of all the places where you can find misinformation, blog comment sections are by far the worst. Best,

B.
Arun, 12:06 PM, June 01, 2010
Hi Bee, thought you might like this
http://theillusionofcertainty.com/Press%20and%20Reviews/CMAJ%20Article,%205-20-08.pdf

PDF File. A way to convey risk information to patients.
Sabine Hossenfelder, 5:15 AM, June 02, 2010
Thanks, will have a look!