Saturday, November 15, 2008

It's a man's world...

I recently stumbled across the GenderAnalyzer, a text classifier that according to their description “has been trained on blogs written by men and women” and “uses Artificial Intelligence to determine if a homepage is written by a man or woman”.

So, I piped in some links from my blogroll. Here are the results, sorted by male-ness:


And here is the absolute high-score of male-ness:

Yep.

Sorry Christine.

Having come so far I concluded that writing about science generally must count as an indicator for male-ness, and I discarded the 'GenderAnalyzer' as crap.

My homepage btw, is "quite gender neutral".

32 comments:

Christine said...

Evidently, the analyzer needs... calibration.

But, for the record, maybe it would be safe to declare that I am a woman, married to a man, having with him one children. :)

BTW, have you noticed that (as far as I can tell) there is only one woman in the FQXi essay contest, among 50+ participants so far?

Of course, you can infer who that woman is.

Best,
Christine

Bee said...

Hey Christine,

Good luck with the contest! I hope they have another one next year, maybe with a topic that I can better relate to. Best,

B.

Christine said...

Thanks! Well, I hoped to see your essay there. You still have... time.

Best,
Christine
PS- After a moment's thought, it must have been the "Garrett Lisi and Jacques Distler debates" that I have copied and pasted there. It must have been them... It must have been them...[tears and drama follows with a desolate violin in the background...] :)

Bee said...

Humm. Will see if I could cook together my fuzzy thoughts on the nature of time to something essay-like. I am probably way too conservative for these folks. As far as I am concerned time is a dimension, and it's fundamentally so, period.

Maybe set the background color to pink and try again ;-)

Christine said...

"As far as I am concerned time is a dimension, and it's fundamentally so, period."

Well, keep that one for the "shortest definition of time" contest. The present one you should elaborate more (but in less than 10 pages or 5,000 words...) :)

"Maybe set the background color to pink and try again ;-)"

I refuse! I refuse!

Kea said...

Well, I'm 72% likely female, according to the analyser.

SoloGen said...

This is not so unnatural. I guess they assign maleness/femaleness probabilities to a collection of words in their database, and then compute the likelihood score.
And if those specific words have been usually used by men, which is really not a surprise, their scores would be just off.
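The word-likelihood scheme guessed at above can be sketched as a tiny naive Bayes classifier. This is only an illustration of that guess, not the GenderAnalyzer's actual method: the function names `train` and `classify`, the four toy training texts, the uniform prior and the add-one smoothing are all assumptions made up for the example.

```python
import math
from collections import Counter


def train(docs):
    """Count how often each word appears under each label.

    docs is a list of (label, text) pairs; the labels are arbitrary strings.
    """
    counts = {}
    for label, text in docs:
        counts.setdefault(label, Counter()).update(text.lower().split())
    return counts


def classify(counts, text):
    """Return a dict label -> probability, assuming a uniform prior."""
    vocab = set().union(*counts.values())
    log_scores = {}
    for label, word_counts in counts.items():
        total = sum(word_counts.values())
        # Add-one smoothing so an unseen word does not zero out a label.
        log_scores[label] = sum(
            math.log((word_counts[w] + 1) / (total + len(vocab)))
            for w in text.lower().split()
            if w in vocab  # words never seen in training are ignored
        )
    # Turn log-likelihoods into probabilities that sum to one.
    top = max(log_scores.values())
    weights = {k: math.exp(v - top) for k, v in log_scores.items()}
    norm = sum(weights.values())
    return {k: v / norm for k, v in weights.items()}


# Hypothetical toy training set: topic words, not gender, drive the result.
toy_docs = [
    ("male", "quantum field theory lecture"),
    ("male", "football quantum beer"),
    ("female", "garden recipe family"),
    ("female", "family lecture garden"),
]
counts = train(toy_docs)
probs = classify(counts, "quantum theory")
print(probs)
```

With a training set like this, any text that mentions "quantum" leans towards the label whose authors happened to write about it, whoever the actual author is.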

Giotis said...

"I am probably way too conservative for these folks."

Bee no! Nowadays radical is mainstream and the conservative approach is revolutionary. So do file your essay, you have many chances of winning this.

tpe said...

Hmm. I'm only 61% male, it seems, which is rather shattering. I was going to come in here and kick over your furniture and act extravagantly tough and sweary. Now I find that I have to act at least 39% reasonable. Nightmare. And so, whilst I can't exactly raise my knuckles to shake your hand, I'm surely able to say "hello, toptastic blog, greetings from Ireland" and "grunt".

Gender Analyzer is crushing. I'm going to lift some weights.

Kind regards etc...

TPE

Arun said...

Their little poll on the right shows them correct/incorrect 55%/45%. On 8000+ tries. So somewhat better than chance.
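As a rough check of "somewhat better than chance", here is a sketch that treats the poll as exactly 8000 independent votes with 55% saying "correct" (the real count is only given as 8000+, and self-selected poll votes are hardly independent, so both are assumptions). It uses the normal approximation to the binomial distribution.

```python
import math

n = 8000        # assumed number of votes (the poll only says 8000+)
p_hat = 0.55    # observed fraction of "correct" votes
p_chance = 0.5  # what coin flipping would give

# Standard error of the observed fraction if the classifier were guessing.
std_err = math.sqrt(p_chance * (1 - p_chance) / n)
z = (p_hat - p_chance) / std_err

# One-sided tail probability of a z this large under pure chance.
p_value = 0.5 * math.erfc(z / math.sqrt(2))

print(f"z = {z:.2f}, one-sided p = {p_value:.2e}")
```

Under these assumptions the split is far from what coin flipping would produce, even though 55% correct is still a weak classifier.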

Dr Who said...

The FQXi competition is a bit of a disaster. We have everything from deadly serious high-powered essays from Claus Kiefer all the way down to

"Making Time With Pretty Girls and Hot Stoves by Don Limuti "

*Not* the kind of exercise that will persuade people that this subject ought to be taken seriously.

Bee, I agree that time is just another dimension. But you could still write about why it *seems* to be otherwise....

Rae Ann said...

It's not just science blogs. My result:

"We think http://viciousmomma.blogspot.com is written by a man (86%)."

Funny that they find me more manly than many of the male scientists they analyzed. Very weird.

Andrei Kirilyuk said...

Christine said: “Evidently, the analyzer needs... calibration. But, for the record, maybe it would be safe to declare that I am a woman, married to a man, having with him one children. :)” Too little to prove you're a true woman :). For that you'd need to produce two and a fraction of them :).

But the analyzer does not pretend to determine your physical sex (children, etc.), it only pretends to determine your mental properties as they may show up only within a given (limited) text. There may be a huge difference between the two. The machine is very imperfect, of course, as it seems to ignore the “semantic dynamics” of a text that easily reveals a (mental) woman or man. But it may be partially “right”: women working as professional scientists should accept so-called “men's” (but actually universal, when it's correct) logic. It can be true only up to the applied analysis sophistication, but if it was true, the results could have something to do with your professional aptitude estimate (where you obtain thus a very good mark), while assuming, of course, that the analysed text (e.g. a blog) properly reflects it.

And then, who knows, maybe our dear sisters are not really women any more?! All humans are really brothers, now?! And the damn machine has occasionally revealed this hidden transformation?! Let's stop the horror scenario here...

As concerns your time essay for FQXi contest, I have looked it through and found quite appealing and correlating with my own rigorous results on the subject. The latter are not really suitable for an “essay”, as they are too close to a technically elaborated science, with “formulas”, etc. However, you (and everyone else) may be interested to know that “real”, irreversibly flowing time origin you philosophically “guess” as “concurrence” phenomena to be further specified, as well as relation to the nature of inertia you emphasise, have been clearly, mathematically and physically specified at least since 1997, just along the lines of your general expectations, and shown to be efficient for existing old and new problem solution (see also http://arXiv.org/abs/physics/0401164 and http://arxiv.org/abs/physics/0601140 for recent reviews of related microphysics results, including “Bergsonian” time origin and problem solutions). I naturally don't want to impose anything to anybody, but it just looks a little strange, this kind of “concurrence”, when the same results coexist, often for years, as clearly, rigorously specified and problem-solving ones (but never referred to), on one hand, and “ideologically” very similar but only vague, “philosophical” guesses, on the other hand (another example is everywhere repeated statements that universal and consistent complexity definition does not exist, while it does exist at least for a decade and is just very closely related to the universal and consistent definition of real physical time!). A “proto-science” stepping over (same kind of) “science”, so to say, without any interaction attempt... Innumerable “essays” and whole top-level conferences about time, without anything more than strongly deficient physics and maybe right but vague philosophy... Never mind, I do like your essay and hope it will win over the majority of more abstract approaches to the time problem.

As to some “conservative” opinions here (by Bee and others) that “time is a dimension”, come on, and where does it “flow”, that dimension, especially when you “don't have enough time”?! Just step back in that “dimension”, if you can, to prove it's but a dimension! Or, better still, step forward and understand that a beating, vibrating one (that's why that guy is talking about “pretty girls”, Dr. Who!) can be everything but a mere dimension (why don't we, Brazilians, ever need to be persuaded of it, ah, Christine?). A strange, crazy dimension, behaving as it wants itself, inhomogeneously, randomly, unstoppably (according to emerging events)... No, Christine, you wouldn't need to make any final concession in your essay to “nonexistent time” promoters, even in the name of competition. Be a man, here too (especially here!), as your “rating” indicates and fight to the end to defend your true conviction! And if, in addition, you could refer to a rather unique similar approach and its multiply confirmed, easily accessible results specifying your assumptions, that would even be a fair-man or woman, universally fair - and intelligent - behaviour! (Take it easy, of course, I am just ... a man, imperfect and subjective...)

Cheers, girls and boys, don't be just calculating and competing machines, open your mind and make a jump in real time to the real world, which is larger, ever larger than you would expect within any mechanistic thinking!

Phil Warnell said...

Hi Rae Ann,

“It's not just science blogs. My result: "We think http://viciousmomma.blogspot.com is written by a man (86%)."”

Well, except for being outdone by the FemaleScienceProfessor, I seem to be less male than the rest on this list, including your own, with them reporting:

“We think What is Einstein's Moon? is written by a man (73%).”

Inasmuch as I’m also not a scientist, at least I didn’t suffer the bias either. I guess then perhaps it judged mine as being a little more touchy-feely than yours :-)

Best.

Phil

Phil Warnell said...

Hi Bee,

It would be interesting to learn what the result would be if Stefan’s contributions were omitted from the analysis?

Best,

Phil

Bee said...

Well, if you take my homepage as an indicator:

"We guess http://www.prime-spot.de/ is written by a woman (52%), however it's quite gender neutral."

And for reasons that are somewhat a mystery to me

"We think http://www.lightconeinstitute.org/ is written by a woman (61%)."

Bee said...

Hi Sologen,

Which would mean that 'text classification' amounts to merely word count. Meaning what they actually test is which gender is likely to write about what topic. As far as I am concerned I would have thought the first thing to look for is whether the blog owner maybe has a name! Or maybe just stated his/her gender. Multiplied with probability that people lie on these issues (which I guess very few do). Best,

B.

stefan said...

Hi Phil,

"It would be interesting to learn what the result would be if Stefan’s contributions were omitted from the analysis?"

I don't think that I contribute enough to significantly influence the result ;-)

Cheers, Stefan

Phil Warnell said...

Hi Bee,

"We think http://www.lightconeinstitute.org/ is written by a woman (61%)."

So much for artificial intelligence :-)

Best,

Phil

Bee said...

Dear Arun,

It would be interesting to know whether the poll-result is used for the algorithm's learning curve, and if so, how. Best,

B.

Phil Warnell said...

Hi Stefan,

“I don't think that I contribute enough to significantly influence the result ;-)”

You’re far too modest. Come to think of it I’ve never seen your own home page listed.

Best,

Phil

Bee said...

*lol*

"We have strong indicators that Stefan's homepage is written by a man (100%)."

Now you know what to do: Black background, unmotivated boxes hanging on the front page, no introductions to anything, use words like 'ultra-relativistic' and 'dynamical clustering'.

Phil Warnell said...

Hi Bee,

"We have strong indicators that Stefan's homepage is written by a man (100%)."

Well, in terms of mate selection your own analysis was spot on :-) I wonder if this would rank higher than some sports blogs? :-) I'm going to have to ask Stefan to give me pointers on how to improve my ranking.

Best,

Phil

Phil Warnell said...

Hi Bee,

Just as a Postscript:

As I've calculated the mean between your site and Stefan's is 73% male. With BackReaction scoring 74% that would be well within an accepted deviation :-)

Best,

Phil

RGB said...

Fun exercise!

It would be interesting to know how many different people they had trained the software on. And what sort of occupational distribution they had. For example Bee/Christine's statement about more 'scientific' words being interpreted as being written by a male, just imagine how the program would respond, if the training sets included only women scientists and male artists!

Given that all of us agree that the male-female ratio in science is skewed, a small-number effect would cause 'scientific' blogs to be attributed to men.

Plato said...

Dr. Who:"Making Time With Pretty Girls and Hot Stoves by Don Limuti "

I thought I might try to enter, and found it only for scientists.

As subjective as this may sound some do believe Einstein's view does "carry weight?":) If you know what I mean, and not entirely on a pretty girl scale.

I thought about trying my own blog site, but decided not to. It seems it's not a very good measure of anything, especially of gender?:) Far too complex to measure?:)

Best,

Arun said...

The root is here:
http://blog.uclassify.com/gender-text-analysis/

Another post there says that the gender classification results went down when their servers were unable to keep up with demand.

Neil' said...

Well, considering Lumo's right-wing, skeptical, perhaps sexist (you know, Bee!) and aggressively one-upping sense of competition, I am not surprised at this result:

We have strong indicators that http://motls.blogspot.com/ is written by a man (92%).

Neil' said...

BTW, as many have noted, the analyzing program doesn't really look into "meaning" and attitude-type markers. (Well I don't have enough "too much time on my hands" to really read about how it works, I just assume, not basics, and follow from others.) Hence Bee's interest in social justice and World cooperation (got that right?) does not overcome the apparent "maleness" of her simplistically analyzed writing style. But I was not surprised for Lumo to score so high, because of the high "maleness" FWIW of his way of presenting himself and issues.

What I really want to see: ratings of Obama, McCain, Sarah Barracuda, etc!

philramble said...

My female friend's blog (largely on music) turned out as being 80% male, while mine (I'm male and blog on music, philosophy, technology, science and other subjects) turned out 98% male. We just laughed it off after trying a few URLs. :)

Phil Warnell said...

Hi Arun,

This turns out to be an interesting site in other respects. In poking around I found out it also had a writing style analyzer and took the liberty to plug BackReaction into it to produce the following:

1. Jules Verne (36.5 %)
2. HG Wells (36.4 %)
3. Edgar Allan Poe (6.4 %)
4. Friedrich Wilhelm Nietzsche (4.9 %)
5. Plato (2.6 %)
6. Oscar Wilde (2.2 %)
7. Lewis Carroll (1.9 %)
8. Sir Arthur Conan Doyle (1.7 %)
9. Edgar Rice Burroughs (1.5 %)
10. Frank Baum (1.4 %)
11. Mark Twain (1.3 %)
12. Leo Tolstoy (1.3 %)
13. Johann Wolfgang von Goethe (0.8 %)
14. Charles Dickens (0.3 %)
15. Charles Darwin (0.3 %)
16. Charlotte Bronte (0.2 %)
17. Leonardo da Vinci (0.2 %)
18. William Shakespeare (0.1 %)
19. Jane Austen (0.1 %)
20. Dante Alighieri (0.0 %)
21. Homer (0.0 %)

Perhaps this could form the basis for a whole new discussion around these results:-)

Best,

Phil

andy.s said...

It sucks being a statistical outlier, doesn't it?

I wonder what you would conclude if you looked at the data through the other end of the telescope?

I.e., if you tallied the sex of the people who read your blog, would it come out closer to your score?

I remember reading once, that the audience of Discover/Scientific American and other science magazines is mostly male.

Perhaps these guys are actually giving you useful audience information without knowing it.