Welcome to the second in my ‘Incurably Sceptical’ series (see the first post here). In each post I pick a paper from the cognitive psychology literature that appears interesting based on its abstract alone. We then pick apart the authors’ aims, methodology, analysis and interpretation, having perhaps just a little fun at their expense, but hopefully also learning a few useful things about scientific method along the way. This time we will be looking at a paper entitled ‘Testing for Individual Differences in the Identification of Chemosignals for Fear and Happy: Phenotypic Super-Detectors, Detectors and Non-Detectors.’ [Link] Broadly, the aim of this paper was to examine the extent to which people can detect a person’s mood (fearful or happy) by smelling their under-arm sweat (stay tuned for more on the extraction protocol).
Super-detector (/ recogniser) research is a popular trend at the moment, both academically and in the media. The idea is that some individuals in the population, for whatever reason, have a ‘super’ ability for, say, detecting flavours in wine or coffee, recognising faces, or detecting minute changes in pitch or tone (in general, an extremely heightened ability to detect similarities between two stimuli or patterns based on one or other of the senses). There have been many articles lately about super detectors being used by the police and private companies for all sorts of wonderful things (see: Are you a Super Recogniser? and ‘The Super-Recognisers of Scotland Yard’). Even the higher-quality reporting on the subject has raised my scepto-sense before – see for example this paper, where a group of people with an average 93% accuracy for facial recognition (vs 80% in the general population) apparently deserve the title ‘Super Recognisers’. ‘Slightly Better Recogniser’ might be more appropriate. This is inevitable, of course. When being ‘on trend’ increases your likelihood of getting published, the application of sensational category labels like ‘Super Detector’ to small group differences is to be expected.

So, I would certainly describe myself as being in a sceptical frame of mind when I first read the title of this paper. The abstract didn’t improve things with its mention of ‘implications for the further study of genetic differences’, despite there clearly being no actual genetic analyses in the study. Further, ‘dual processing’, another trendy term, was thrown in despite a lack of clear relevance. In short, this paper appeared to me, based on the abstract, to be tenuously ticking all the boxes publishers like to see, and when a paper is doing that, I tend to worry that low-quality work is being masked underneath.
However, the abstract also said that “mood odors had been collected from 14 male donors during a mood induction task” and that 41 females had been asked to identify the mood odor chemosignals … so obviously I read on.
So, yes, onto this extraction method. Normally I would paraphrase a method, but I enjoyed the tone of the write-up of this one so much that below I reproduce (nearly) the whole section on extraction:
In this study the mood odors were collected from 14 healthy male undergraduate nonsmokers. For a 7-day period prior to the sample collection, the donors only used the provided odor-free deodorant and cleansing products. The donors were instructed to shower (using the soap provided) the morning of sample collection approximately 6 hours prior to sample collection. They were also given a list of prohibited “spicy” and other odorous foods and did not eat them during the 24 hours prior to the collection.
Axillary samples were collected during two video mood inductions, one day apart. The fear mood and happy mood induction videos were 12-minute standardized videos … The videos were shown twice to the subjects for a 24-minute induction. The videos have multiple facial displays for fear (or happy). There is no narrative theme. This reduces the likelihood that repetition of the 12 minute video would decrease the impact of the video.
Samples were collected onto cleaned Kerlix8 brand sterile gauze. Prior to mood induction, donors were given 2 pairs of gauze strips (each strip 3cm x 8cm) in separate plastic enclosed bags labeled “right” and “left” arm. They placed one pair in each left/right axilla. At 12 minutes into the mood induction, the film was paused and donors removed one pair (1 left and 1 right) of axillary pads. Donors placed each pad into its labeled plastic zipper bags. All air was forced from the bag prior to sealing. The second pair of pads was collected in the same manner after 24 minutes. All samples were placed in a minus 80C° freezer within 2 hours of collection.
So, yes, an unusually invasive and controlling set of requirements for these healthy male undergraduates. They don’t report what incentive the donors had to take part in the study, or whether they were paid more or less than the people who had to smell their sweat – tough call, that one. In terms of the smelling protocol, “Participants (detectors) were tested individually in dedicated testing rooms approximately 8’x8′” (I’m not sure why they included the room size here, but perhaps if you know a lot about smelling, this is quite important for understanding the … diffusion dynamics or something). Then:
On each trial the experimenter placed five identical sample jars from one set of donors on a plastic tray on the table, shuffled them, and presented the tray of jars to the detector. S/he was instructed to sniff the jars as many times as necessary and in any order. The detector identified the odors by setting each jar on its label [e.g. fear, happy, control] on a place-mat.
I imagine this scene a bit like a gross version of the ball and cup trick (‘Keep your eyes on the fear-sweat jar’). The offer of unlimited sniffs was very generous, though. Anyway … despite its amusing elements, on reading the methodology I was struck by how well controlled it was, e.g. “To avoid position effects, half of the detectors had fear labels on the left side of the place mat and half of the detectors had them on the right side of the place mat.” There were lots of neat little controls like this built into the study to ensure the results weren’t biased, and overall I was impressed by the attention to detail.
So, they extracted sweat during fear-inducing or happy-inducing videos, then got people to sniff fear, happy, and control (no video) sweat to see if they could correctly label them. Simple enough. Now, what did they find? “The first analysis (rtt) showed that the population was phenotypically heterogeneous, not homogeneous, in identification accuracy.” This sentence annoyed me, I must admit. The use of the word phenotypically implies an important distinction is being made between the participants’ genotype (their set of genes) and their phenotype (their body). But there is no genetic testing in this paper, so the distinction is pointless – the word can be deleted entirely without affecting the meaning of the sentence. And heterogeneous? All that means is that the individuals in the sample weren’t all equally good at smelling – the addition of these words seems just to serve to add a ‘sciencey’-sounding feel to the paper. If you’re wondering why I’ve gone on about this for so long, well yes, it’s a pet peeve of mine. Flowery, jargon-rich scientific writing usually hides a lack of competence and knowledge, rather than demonstrating it. It also serves to alienate lay people, and even scientists from other disciplines. It is exactly the kind of writing I expected from reading the abstract, with its unnecessary use of trendy terms. In truth it isn’t a bad paper underneath, despite my expectations, so I wish all the more that they could have stuck to a more concise, less show-offy (that’s not jargon, just a made-up word, by the way) reporting style.
I think what they were actually trying to convey with this sentence was that their participants’ smelling ability, rather than being a smooth spectrum from rubbish to good, was broken up into well-defined groups. Indeed, around 49% were deemed to be super-detectors, with around 75% accuracy by the final trial; 33% were just ‘detectors’ (around 40% accuracy on the final trial); and around 18% were ‘non-detectors’, with 0% accuracy. Now, as I briefly outlined earlier, this concept of super detectors rests on the idea that there is a proportion of the population who have an unusually heightened ability. Any definition you look up of ‘super’ is likely to include the words ‘particularly’, ‘especially’, ‘unusually’, etc. This makes it a peculiar term to apply to the largest group (half the sample!). This is the majority, not some niche elite … and here again we arrive at issues not with the underlying paper itself (the statistical analysis is, as far as I can tell, actually excellent) but with sensationalism and dressing-up in the write-up. These authors used the term ‘super-detectors’ despite the ludicrous fact that their ‘super’ group was half the sample. The only reason for this can be that it is an eye-catching term and increases their chances of getting published. Sigh. There are no 100% objective scientists. They are all just regular people who need to further their careers. 15-year-old me would be very depressed.
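As an aside, the ‘heterogeneous, not homogeneous’ claim can be made concrete with a quick simulation. The sketch below is entirely my own illustration, not the paper’s analysis: I borrow only the reported group proportions (49/33/18%) and final-trial accuracies (75/40/0%); the trial count and the simulation itself are my own assumptions. The point is that a population split into three accuracy groups produces far more between-person spread in scores than a homogeneous population with the same average accuracy – roughly the kind of signal a reliability statistic like their rtt is sensitive to.

```python
# My own toy sketch, NOT the paper's analysis. Group proportions and
# accuracies are the paper's reported figures; the trial count (20) and
# the simulation approach are illustrative assumptions of mine.
import random
import statistics

random.seed(0)

GROUPS = [(0.49, 0.75), (0.33, 0.40), (0.18, 0.00)]  # (proportion, accuracy)
MEAN_ACC = sum(p * a for p, a in GROUPS)             # weighted mean, ~0.50
N_PEOPLE, N_TRIALS = 1000, 20

def simulate(accuracies):
    """Per-person proportion correct over N_TRIALS Bernoulli trials."""
    return [sum(random.random() < acc for _ in range(N_TRIALS)) / N_TRIALS
            for acc in accuracies]

# Heterogeneous population: each person belongs to one of the three groups.
het_acc = random.choices([a for _, a in GROUPS],
                         weights=[p for p, _ in GROUPS], k=N_PEOPLE)
het_scores = simulate(het_acc)

# Homogeneous population: everyone shares the same mean accuracy.
hom_scores = simulate([MEAN_ACC] * N_PEOPLE)

print(f"heterogeneous variance: {statistics.pvariance(het_scores):.4f}")
print(f"homogeneous variance:   {statistics.pvariance(hom_scores):.4f}")
```

Both populations end up with the same average score (about 50% correct), but the three-group population shows several times the between-person variance – which is why ‘the sample wasn’t all equally good at smelling’ is the plain-English version of their finding.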