30 years of science: Expressions of certainty // Cogsci

When writing a scientific paper it is considered good style to convey an absolute and unrelenting trust in your own findings. So in many papers the discussion section starts with something like 'the present study is the first to conclusively show that (...)' or 'the results clearly show that (...)'.

I've written a few lines like that as well. But, tainted with hypocrisy, I actually find this style of writing a bit weird. It is no secret that cognitive science is a messy business, and that experimental data is seldom clear-cut. Most scientists, and certainly the good ones, are quite frank about this. So why the sudden attack of confidence when writing a paper?

[Image: a random academic paper]

Well... I don't know, and perhaps it is just a matter of style without any real reason. Fashion, in a sense. But it does strike me that this is a relatively new phenomenon, and that scientists in the past were much more equivocal in their writing. The quite recent past, even. Take, for example, Michael Posner, who wrote in the abstract of his seminal 1980 paper Orienting of Attention that '(...) the possibility is explored that (...)'. Or Giacomo Rizzolatti, of mirror neuron fame, who wrote back in 1987 that '(...) the hypothesis is proposed that postulates (...)'. Both of these sentences (which were of course cherry-picked for the occasion) convey a modest degree of belief in one's own theory and/or findings: I believe in X, but I could be wrong.

I thought it would be cool to analyse the PubMed data, which I described in a previous post, to see if there is indeed some evidence that 'certainty-writing' has become more prevalent over the years. The nice thing about this dataset is that it includes a lot of abstracts (short summaries of the papers). So, by data mining these abstracts, you can count how often particular words are used. (In case you're interested, the top five in descending order are 'neurons', 'cells', 'results', 'both', and 'during', excluding the 100 most common words, like 'the' and 'and', of course.)
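For the curious, counting words like this takes only a few lines. The snippet below is a minimal sketch and not the script used for this post: the function name `top_words`, the tiny `STOPWORDS` set (a stand-in for the 100 most common words) and the three example abstracts are all made up for illustration.

```python
import re
from collections import Counter

# Hypothetical stand-in for the list of very common words to exclude.
STOPWORDS = {"the", "and", "of", "in", "a", "to", "is", "that"}


def top_words(abstracts, n=5, stopwords=STOPWORDS):
    """Return the n most frequent words across all abstracts, ignoring stopwords."""
    counts = Counter()
    for text in abstracts:
        # Lowercase the text and keep runs of letters only.
        words = re.findall(r"[a-z]+", text.lower())
        counts.update(w for w in words if w not in stopwords)
    return [word for word, _ in counts.most_common(n)]


# Made-up abstracts, just to show the call.
abstracts = [
    "The results show that neurons in the cortex respond to the cue.",
    "Neurons and cells respond during the task.",
    "The results suggest that neurons adapt.",
]
print(top_words(abstracts, n=1))
```

With real data you would pass in the full list of abstracts and a proper list of common words instead of the toy set used here.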

So to see how certainty-writing has evolved over time I did the following: I determined the relative frequency of a number of common certainty keywords ('certainly', 'show', etc.) and uncertainty keywords ('arguably', 'suggest', etc.) for every year since 1980 (some abstracts go back to 1974, but there's not enough data from before 1980 to do any useful analysis). I picked 9 words in each category, and more-or-less matched them: 'show' vs 'suggest', 'clear' vs 'possible', etc. In total, there were 25,474,335 words (all words, not just the keywords) across 237,794 abstracts. The 'expressed certainty' in the graph below is simply the number of certainty keywords divided by the number of uncertainty keywords.
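In code, the measure described above could look roughly like this. Again, this is a sketch under assumptions rather than the actual analysis script: the function name `expressed_certainty` and the example records are invented, only the three keywords per category that are named in this post are included (not the full lists of 9), and words are matched exactly, so 'shows' does not count as 'show'.

```python
import re
from collections import defaultdict

# Only the keywords named in the post; the full lists had 9 words each.
CERTAINTY = {"certainly", "show", "clear"}
UNCERTAINTY = {"arguably", "suggest", "possible"}


def expressed_certainty(records):
    """Map each year to (# certainty keywords) / (# uncertainty keywords).

    records is an iterable of (year, abstract_text) pairs. Years without
    any uncertainty keyword are skipped to avoid dividing by zero.
    """
    certain = defaultdict(int)
    uncertain = defaultdict(int)
    for year, text in records:
        for word in re.findall(r"[a-z]+", text.lower()):
            if word in CERTAINTY:
                certain[year] += 1
            elif word in UNCERTAINTY:
                uncertain[year] += 1
    return {
        year: certain[year] / uncertain[year]
        for year in sorted(uncertain)
        if uncertain[year] > 0
    }


# Made-up records, just to show the call.
records = [
    (1980, "We suggest it is possible that attention shifts."),
    (1980, "The data show an effect."),
    (2010, "The results show a clear effect."),
    (2010, "We suggest an alternative."),
]
print(expressed_certainty(records))  # {1980: 0.5, 2010: 2.0}
```

A value below 1 means the uncertainty keywords dominate in that year, and a value above 1 means the certainty keywords do.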

So, let's take a look!

Cool, right? It ~~clearly is~~ actually seems to be the case. In general, the uncertainty keywords dominate the abstracts, but there appears to be a shift towards the certainty keywords. Whether this reflects a more general shift in language, or a specific shift in academic language, is difficult to say, of course. You would need to do a similar analysis of a corpus of, say, newspaper articles, and directly compare the results. But either way, I think it is quite interesting.

Stay tuned for more!

References

Posner, M. I. (1980). Orienting of attention. Quarterly Journal of Experimental Psychology, 32(1), 3-25.

Rizzolatti, G., Riggio, L., Dascola, I., & Umiltà, C. (1987). Reorienting attention across the horizontal and vertical meridians: evidence in favor of a premotor theory of attention. Neuropsychologia, 25(1A), 31-40.