Friday, August 15, 2014

Guilty Brains

In a well-known paper on the role of neuroscience in the court, law professor Stephen J. Morse opens with the following paragraph:
Brains do not commit crimes; people commit crimes. This conclusion should be self-evident, but, infected and inflamed by stunning advances in our understanding of the brain, advocates all too often make moral and legal claims that the new neuroscience does not entail and cannot sustain. Particular brain findings are thought to lead inevitably to moral or legal conclusions. Brains are blamed for offenses; agency and responsibility disappear from the legal landscape.

         From: Brain Overclaim Syndrome and Criminal Responsibility: A Diagnostic Note
         Public Law and Legal Theory Research Paper Series Research Paper No. #06-35 

Professor Morse is of course punning on the famous line often used in defence of gun rights in America, namely, "Guns don't kill people, people do". This gun-rights slogan has the virtue of at least being literally true. No, guns don't kill people; guns just make it enormously easier to do so. Guns are exceptionally efficient facilitators of death. Of course, as an argument in favor of gun rights, it's fatuous. For instance, you never hear anyone say: Atomic bombs don't annihilate populations, people do. Or, how about: weaponized anthrax doesn't kill people, people do.

But let's return to Morse's line about brains. "Brains don't commit crimes, people do". I confess, I'm a neuroscientist and not a legal scholar, but something seems amiss about this one. Is there some kind of logical fallacy lurking about in that line? If the fallacy doesn't jump out at you, maybe the following mini-vignettes will help clarify what I'm talking about.

What is a person? Maybe there is a legal definition, but let's forget that for a moment. Let's say for now that we know a person when we see one. Below are the stories of Joe, Mike, Jim, Jake, and Jack.

"Hey, look at that person over there. Yeah, the guy riding the bike. Nice person. Friendly fellow. His name's Joe." Joe has a head, he's got limbs, a body, he walks and talks, has a social security number, and he typically wears clothes in public. He holds a steady job. He's a citizen of country X.

Now look over there. It's Mike. Mike was in the war. Mike is in a wheelchair these days. He has a head, a body, but is missing some limbs. Everybody agrees Mike's a good guy. Mike is a good person.

Here's someone really interesting. A modern marvel of medicine. It's Jim. Jim has a head, he has limbs, and a body, but several years ago his heart failed him. His heart just stopped working. But he was lucky. He underwent a successful heart transplant surgery. Jim doesn't have his own heart anymore. Jim has Jake's heart. Jake was a good person too.  Jake was a good person, but Jake was a little odd. Jake, you might say, was nuts. Jake even believed that he was Jesus Christ, sometimes. Jake had a good heart, though. And now Jim has Jake's heart. But even though Jim has Jake's heart, Jim doesn't think he's Jesus Christ. Jim is just like Jim always was, except his ticker is in tip-top shape.

Jack is a bit of a sad case. Let me tell you his story. Jack was driving out by Corcoran State Prison one night and got in a bad accident. His body and limbs were unscathed in the crash, but his brain was all smashed up. Jack was in dire need of a new brain; he needed a brain transplant. Remarkably, doctors in the emergency room were able to find a fresh brain for Jack. But as a result of an extraordinary series of mix-ups, caused by inept record-keeping and carelessness on the part of medical staff, Jack didn't get the brain he was supposed to get (the brain of a 65-year-old female named Gwen) -- oh no. Jack didn't get that brain, Jack got Charles Manson's brain.

When "Jack" was released from the hospital, "Jack" felt a little sluggish, as one might expect after a brain transplant, but he could walk and talk and eat and drink and generally felt pretty good. Jack didn't feel himself, though. And Jack wasn't acting like himself, either. To the people who knew and loved Jack, Jack wasn't Jack anymore. For instance, when Jack's wife of 30 years, Nancy, welcomed him home for the first time after the brain operation, Jack stared at her with the eyes of a maniac and remarked to her: "I'm nobody. I'm a tramp, a bum, a hobo. I'm a boxcar and a jug of wine. And a straight razor ... if you get too close to me." Needless to say, this frightened Nancy.

By the time the error was detected at the hospital, Jack had already gone on a crime spree involving a series of unspeakably gruesome acts of depravity. What had gotten into Jack? Everybody wondered whether it was the brain transplant that had caused this terrifying change in his behavior. Eventually, the police caught up with Jack and he was taken in on numerous charges including murder and kidnapping.

Meanwhile, the neurosurgeon who had performed the brain transplant convinced the authorities to let him perform another operation on Jack, to take out Manson's brain and replace it with the healthy female brain. Jack returned to the hospital and left several weeks later, in evident good health and spirits, equipped with a 65-year-old female's brain, and expressing a newfound appreciation of Oprah Winfrey, knitting, and flower pattern dresses. Unfortunately, a judge's order sent Jack back to jail to await trial.

Jack was unhappy in jail and was growing depressed in a dark solitary cell. He thought often of his husband, Martin, and how much he would love to return to his garden or just sit in the rocking chair and knit a sweater on a lazy Sunday. The public defender assigned to the case assured Jack that he had a plan. He told Jack:

-- "I know how to win this case. You didn't commit those crimes. It wasn't you. It was Charlie's brain".

-- "But they have witnesses," said Jack. "I don't remember any of it. They say it was me, though. They all saw me. Blood was on my hands. They have my DNA. Everything points to me. I guess I did it. I'm doomed."

-- "Jack, that wasn't you!"

-- "My name isn't Jack, it's Gwendolyn. Call me Gwen, please!"

Jack sat silently in the courtroom as the trial went on, day after day, another witness describing in gory detail his horrific and senseless acts. Jack often welled up with tears as he listened to this disturbing litany, not remembering any of it, but increasingly resigned to the overwhelming probability of his guilt.

But Jack's lawyer had something extraordinary in store for everyone, an unprecedented legal gambit that he was sure would turn the tide. He called Jack to the stand. Jack testified that he didn't remember any of the crimes, that he had never seen any of the victims whose pictures, one after the other, were presented to him.

-- "Ladies and gentlemen of the jury. There is a reason that Jack doesn't remember any of this, that Jack doesn't recognize any of these faces. It's because Jack didn't commit these crimes. Jack's no murderer. It wasn't Jack, at all ...", pausing for effect: "Ladies and gentlemen, it was Charles Manson's brain!!!! The person who should be on trial is resting quietly, awash in formaldehyde, in a 6-litre mason jar."

Everyone in the courtroom gasped. The prosecution objected vigorously: "There is no evidence whatsoever of Manson's brain's involvement, nothing. This is a last ditch effort to derail these grave proceedings by turning this trial into a bizarre farce!" The worldwide news media began covering the trial, referring to the bizarre ordeal as, simply, Mansonstein.

But Jack's lawyer was not done.

-- "Your Honor, I now call to the stand my final witness: Charles Manson's brain!"

-- "Objection!! Your Honor, the witness has not been announced, this is highly irregular -- brains do not commit crimes, people do!"

-- "Objection overruled, I will allow the witness. I am curious to see what Charles Manson's brain has to say."

At that point, Jack's neurosurgeon entered the courtroom carrying a semi-transparent glass jar with a human brain submerged in a yellow-tinged liquid. The surgeon carefully approached the stand, deposited the jar on the seat of the witness chair, and hooked up electrodes that were attached to the surface of the brain to a large electronic device that was itself attached to a pair of speakers.

Jack's lawyer began his questioning. "Mr. Manson, er ... Mr. Manson's brain, where were you the night of the 23rd of August?"

Manson's brain (heard through speakers): "You can't prove anything. There's nothing to prove. Every man judges himself. He knows what he is. You know what you are, as I know what I am, we all know what we are. That cat, Jack, knows what he knows, he's a real deal killing machine, I been locked up in this septic tank, my brain on ice."

Public Defender: "Have you ever seen the inside of Jack's skull?"

Manson's Brain: "Sure I been in there. Had a real good time, caught up with some old friends and played nasty tricks on people. Jack's skull treated me real nice! We had a good ole party and then we parted ways."

Public Defender: "So you committed the gruesome acts of violence attributed to Jack's person?"

Manson's Brain: "Yeah, I did done that, those things. Laddaddada!! I'm your Huckleberry, now get me out of this god-forsaken brain tank and let my spirit free again!!"

The courtroom is stunned. There is a hushed silence and then murmuring and then isolated shouts and then a gathering chorus: "Free Jack! Free Jack! Jack is innocent, Jack is innocent!"

And the curtains close.

Tuesday, March 19, 2013

Reaction Time Experiments: Functional Neuroimaging on the Cheap

In my last post I discussed Max Coltheart's claim that functional neuroimaging has nothing to say when it comes to deciding between two psychological theories. I concluded that, really, you just have to concede the point -- but on very narrow grounds. Yes, if two theories do not make a prediction about a functional neuroimaging measure, then it is of course trivially true that fMRI cannot adjudicate between them. This is nothing specific to brain imaging. One can say, generally, that for any dependent measure to be useful in deciding between two scientific hypotheses, the hypotheses must make predictions about outcomes on that variable. I further concluded that there is no special reason why psychological theories cannot make predictions about functional neuroimaging data. Here I want to elaborate on that point, because, clearly, at least a few people believe that there is a reason to exclude functional neuroimaging data from having a say in psychological theory-testing.

In a recent talk at the Rotman Research Institute in Toronto, Russ Poldrack made the not wholly tongue-in-cheek remark that when it comes to testing psychological theories, you can think of functional neuroimaging as "like a really expensive reaction time experiment". This formulation works for me. And you know what? I like it even better when you flip it around, like this: A reaction time (RT) experiment is like a really cheap functional neuroimaging study. 

Hear me out. The logic of an RT experiment is simple. When a subject responds more slowly in one condition than another, we say that this condition required more "cognitive processing". More cognitive processing is more work. Who carries out that work? I think B.F. Skinner might even admit (if pressed) that it's the brain that carries out that work. Thus, RT is a measure of brain work -- RT is a measure of brain activity. It is no coincidence that the dominant analytic paradigm in functional neuroimaging over the past 30 years -- "subtraction logic" -- was pilfered outright from the Donders/Sternberg approach to psychological experiments using RT.
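For the concretely minded, here is a minimal sketch of subtraction logic applied to RT. The reaction times are made-up illustrative values (not data from any study), and the function name `subtraction_contrast` is my own invention for the example.

```python
from statistics import mean

# Hypothetical reaction times in milliseconds (illustrative values, not real data).
rt_low_frequency = [642, 610, 655, 630, 618]   # the slower, "harder" condition
rt_high_frequency = [548, 560, 531, 575, 556]  # the faster, "easier" condition


def subtraction_contrast(condition_a, condition_b):
    """Return mean RT of condition A minus mean RT of condition B, in ms."""
    return mean(condition_a) - mean(condition_b)


# Under subtraction logic, the difference is read as the extra
# "cognitive processing" (brain work) that the slower condition demands.
extra_processing_ms = subtraction_contrast(rt_low_frequency, rt_high_frequency)
print(f"Low minus high frequency: {extra_processing_ms:.1f} ms")
```

Swap the two lists of RTs for two lists of voxel signal values and the same subtraction is, in spirit, the standard fMRI contrast.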

It must be said on behalf of RT: pretty darn good temporal resolution, terrible spatial resolution. Indeed, with RT, you only get one voxel. (Even EEG guys scoff at the spatial resolution of RT-imaging experiments). Well, no, it's not a fancy technique. No glossy brochures, no booth at SFN, no sales division. It doesn't involve measuring brain waves, magnetic fields, electron spins, or anything like that. Nevertheless, make no mistake, RT is a neuroscience technique. And although RT experiments aren't really ever presented in the form of "images", they perfectly well could be. After all, EEG was around awhile before its untidy squiggles were repackaged as pretty color interpolated brain maps. Want to get your RT-Imaging experiment published in Neuroimage? How about something like this: statistical parametric mapping on RT data (SPM-RT). Homage to the square.

   Figure 1. SPM-RT Image, Lexical Decision: Frequency Contrast (High > Low) p < .01, uncorrected.

So if you present your RT "contrasts" (your t-tests, your ANOVAs, your regressions) in Josef Albers hues, then, yes, you're doing a very cost-effective version of brain imaging. (This is good news if you're a cognitive psychologist angling for that "cognitive neuroscience" faculty job: tout all your good work in one-voxel statistical parametric mapping of RT data; talk up your expertise in "RT-imaging"). In fact, you can probably even jerry-rig SPM to analyze your RT data and display it in that iron-hot color scale in the SPM glass brain. (Don't forget to turn off temporal smoothing and convolution with the HRF.) You too, devotee of RT methodology, can take advantage of the seductive allure of brain imaging.
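If you'd rather not wrestle with SPM itself, a one-voxel "statistical map" can be sketched in a few lines. This is only an illustration under loud assumptions: the per-subject mean RTs are invented, the helper `paired_t` is hypothetical, and a plain paired t-test stands in for whatever model you would actually fit.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical per-subject mean RTs in ms (illustrative values, not real data).
low_frequency = [640, 615, 660, 628, 612, 650]
high_frequency = [560, 555, 570, 562, 548, 575]


def paired_t(cond_a, cond_b):
    """Paired t-statistic and degrees of freedom for the contrast A minus B."""
    diffs = [a - b for a, b in zip(cond_a, cond_b)]
    n = len(diffs)
    t_value = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t_value, n - 1


t_value, df = paired_t(low_frequency, high_frequency)

# The entire "brain volume": a 1 x 1 image holding a single t-value.
spm_rt_image = [[t_value]]
print(f"SPM-RT map (Low > High), df = {df}: {spm_rt_image}")
```

Choosing the color scale is left as an exercise for the reader.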

But, but, but, but .... wait a second, you say! "RT is a behavioral measure, and fMRI is a brain measure. They're different!" This is something like a category error, I say. RT is a measure of elapsed time. It has zero scientific significance in and of itself. Its significance rests entirely in the interpretation given to it, the inferences made with it. And those inferences, as I have already mentioned, are about "mental effort", "cognitive processing", which are mere synonyms for "brain work", "brain activity". As a matter of fact, fMRI is only a little bit "closer to the metal" -- the measure is blood oxygenation and the inference is about neural activity in this or that region.

Adrian Owen and colleagues have shown that patients in a vegetative state, who otherwise have no way of communicating with the outside world, can nevertheless participate in simple psychological experiments while undergoing fMRI scanning. To answer "yes", the patient imagines that he or she is playing tennis, and a particular network of brain areas lights up. Thus, the patient communicates his or her intentions by "thinking" in a certain way. Is this a behavior? I don't see why not. It just so happens that this behavior is witnessed by an expensive MRI scanner, rather than a cheap "button box". The point being that the distinction between brain and behavior isn't so cut-and-dried.

Let's return to the original question. Can functional neuroimaging data constrain psychological theory? Well, it remains the case that if a theory doesn't make a prediction about a particular outcome measure, then it's hard to see how that measure is going to be of much help testing your theory. But if you accept my argument that RT, the workhorse dependent variable of experimental psychology for the last 50 years, is itself a brain measure, then we see that the whole argument is something of a mirage, a false dichotomy, a hot-iron herring. On the bright side, does this mean we can finally merge cognitive neuroscience and cognitive psychology? Can we have a new field, a new annual meeting? I propose Cognitive Neuropsychognomics. Organizers, founders, wherever you are, I have only one request: hold the annual meeting mid-February in Kauai or somewhere else suitably tropical.

Tuesday, July 17, 2012

A trick question: What has functional neuroimaging told us about the mind?

I'm not actually going to try and answer the question posed in the title, which is taken from Coltheart's (2006) legendary critique of functional neuroimaging in a special issue of the journal Cortex. To summarize, Coltheart concluded that, no, functional neuroimaging hadn't told us anything about the mind so far; and he challenged others to prove him wrong. Others have taken the bait and made heroic and important efforts to meet Coltheart's challenge. Rather, I'm simply going to question the question. Because it's a tricky question. Indeed, it's a trick question.

This may seem obvious and elementary, but to answer Coltheart's question one first has to know what his question means. And the critical word in his question that we need to define is "mind". What has functional neuroimaging told us, he asks, about the mind? As good reductionists, we might say: "wait, the mind is the brain, they denote the same object". The morning star is the evening star. The two are synonymous. So, substituting "brain" for "mind", we rephrase the question as follows: "What has functional neuroimaging told us about the brain?" And then the answer is trivial, because novel information about the brain gotten from functional neuroimaging answers the challenge. Case closed.

If only it were so easy. In fact, when Coltheart uses the word "mind" he's not talking about the "brain". He's talking about something else. Is the mind a thing? Or is it an idea? Can we touch it? Can we define it?  Although Coltheart uses the word "mind" 11 times in his essay, he never actually provides a definition. I'm not going to try and define it, either.

If we look elsewhere in the paper for clues about what is meant by "the mind", however, we find that Coltheart is really concerned with psychological theories and the ability to adjudicate between them. But what are psychological theories about? 

From the first paragraph of the Introduction (emphasis added):

There are numerous different reasons for doing [functional neuroimaging]. I will consider only one of these reasons, namely, to try to learn more about cognition itself.

And then:

Although there exists a huge volume of recent literature reporting the results of cognitive neuroimaging studies, there are surprisingly few papers which have evaluated this technique as a way of studying cognition itself.

OK. We can infer that psychological theories are about cognition itself. And we can further infer that "cognition itself" is separate and conceptually distinct from the "brain". And so what is the "mind"? It's cognition itself. It's not the brain. It's apart from the brain -- it's itself.

Another hint comes further down on page 1 (emphasis in original):

My paper, like Henson’s, is concerned solely with the impact of functional neuroimaging on the evaluation of theories that are expressed solely at the psychological level.

So the mind is akin to cognition itself, and cognition itself is described solely at the psychological level. In other words, don't mix up the brain with cognition itself. Stay at the psychological level. Any reference to the brain in a pure psychological theory is verboten. That would be mixing levels. Mixing metaphors with molecules.

The question, then, is whether functional neuroimaging can adjudicate between competing psychological theories that are about cognition itself. And what sort of predictions do these (pure) psychological theories actually make? They make predictions about behavior. Coltheart uses as an example two theories of reading that posit serial or parallel processing, respectively. For instance, in serial processing:

When a word contains an irregular grapheme-phoneme correspondence, the later in that word that correspondence is the less the word’s reading-aloud latency will be affected by its irregularity.

Here, the dependent variable is a measure of response time, or latency, and whether it depends on spelling irregularity.  In all of Coltheart's examples of the predictions made by psychological theories the dependent variable is always a behavioral measure (i.e. reaction time, accuracy, etc.) and never a brain measure.

But that follows perfectly from Coltheart's stipulations. Theories are expressed at the psychological level. They don't make reference to the brain. And because they don't make reference to the brain, they don't make predictions about the brain.  And because they don't make predictions about the brain, ipso facto, functional neuroimaging cannot adjudicate between said theories.

Coltheart was right!

But Coltheart's conclusion is not an empirical one, based on an evaluation of the functional neuroimaging literature. It is simply axiomatic. He defines a psychological theory as that which does not refer to the brain ("expressed solely at the psychological level") and which makes predictions about variables (reaction time, accuracy) that cannot be measured by functional brain imaging. So we know, a priori, that functional neuroimaging cannot tell us anything about these particular psychological theories. And all the articles in the special issue of Cortex that attempted to meet Coltheart's challenge were doomed to failure, a priori, on the basis of a simple logical deduction.

Coltheart was right, but it was a trick question.

Suppose that we loosen up Coltheart's definition of psychological theories to allow for those that make contact with the brain? And if such theories make predictions about states of affairs in the brain, then, guess what, all of a sudden functional neuroimaging can adjudicate between two competing psychological theories that make different predictions about brain activation.

And is there any reason a psychological theory, other than for reasons of ideological purity, should not make contact with the brain? Or is that some sort of contradiction? Can psychology be mixed with neuroscience? Isn't that what's called ..... cognitive neuroscience?

My answer is, of course, that "psychological theories" can certainly make reference to the brain. Indeed, such theories can be fundamentally neuroscientific. They may borrow concepts and terminology from multiple traditions of inquiry, including cognitive psychology, neurology, neuroscience, psychiatry, neuroanatomy, genetics, physiology, sociology, economics, and so on.  Buchsbaum & D'Esposito (2008) made a similar point in a book chapter a few years ago and I'll quote it before a brief conclusion:

A hypothetical philosopher of metaphysics might ask the question: ‘What has physics told us about metaphysics?’ to which he might answer that because metaphysics is the science of the non-physical, physics by definition has nothing to say about metaphysics. Unlike metaphysics and physics, however, most would agree that the study of the mind and the study of the brain are fundamentally related if not, indeed, one and the same endeavour. There is therefore absolutely no reason why psychological theories should not refer to and make explicit predictions about brain function, nor is there any reason to think such theories would, upon making contact with neuroscience, somehow cease to be ‘psychological’. 

To conclude, in 2006 Coltheart asked whether functional neuroimaging had told us anything about the mind. He concluded "no". There have been many earnest and technically innovative efforts over the years to persuade him and other skeptics otherwise. Cognitive ontologies, Bayesian probabilistic reverse inference, forward inference, reverse inference, structure-function association, etc. etc. etc. All of this stuff is fantastic and welcome and is enormously useful to cognitive neuroscience. But none of it answers Coltheart's challenge, in the way it was framed. Because it's impossible. The game was rigged.

So here's my answer: we simply must concede the point. No, functional neuroimaging does not and cannot adjudicate between theories expressed solely at the psychological level that make no predictions about the brain. How could it? It was a trick question all along.

Tuesday, July 26, 2011

Exposing the Epiphenomenon with TMS?

Nowadays it is proper for the neuroscientist to be slightly embarrassed and faintly apologetic about his or her fMRI researches. One should strike the same attitude about fMRI as one does about the British rock band Coldplay -- I know you like them, you know I like them, we all know we like Coldplay, but we're frankly a little bit ashamed about it. It is somehow tacky to like Coldplay.

There is a ready antidote for any researcher that wants to show everyone that he or she is not fooled by the easy listening schlock of fMRI, that he or she is a reluctant and skeptical user of that pretty little correlative toy. The ready antidote is transcranial magnetic stimulation -- TMS. TMS is tough. TMS is no-nonsense. TMS gets the job done -- and with no pictures. TMS tells you what is necessary, and it tells you about what is causal. And here's the kicker: TMS calls the bluff of fMRI, it holds fMRI to account, it cuts right through fMRI's bull. And TMS has nothing but contempt for Coldplay.

Here's how it often works with TMS. You've done a few fMRI studies and you keep showing that a region in the parietal lobe activates when subjects are engaged in a particular cognitive activity. It's not a one-off finding. You've used different paradigms and control conditions and this parietal area keeps popping up, it keeps stubbornly activating. You're a careful researcher. You've written the papers, you've presented the data at meetings, and your colleagues are impressed -- but not convinced. How do you know this region is "necessary" for the cognitive activity, they ask? How do you know it's not some sort of "epiphenomenon"? It's fMRI, after all.

At this point, you know what must be done. You know you have to zap your parietal area with TMS while subjects perform your paradigm. If you disrupt your parietal area with TMS and performance on the task suffers, then you've closed the loop. You've shown that your region is necessary, that it's causal, that it's for real, that it's not a stinking epiphenomenon. Case closed. The end.

That's the formula, that's the pattern, and that's how it usually unfolds. In other words, and somewhat paradoxically: TMS always has fMRI's back. TMS tells you what you thought you already knew. You never thought your parietal area was an "epiphenomenon", you thought it was doing something legitimately useful. Thanks for confirming for us what we already fervently believed, TMS!

But hang on. How come you never hear this story (this inferential pattern)?:

"After six fMRI studies implicating our parietal area in cognitive activity X, we performed a series of TMS studies designed to see whether our region was really necessary. The results showed no effect of TMS at the stimulation site during the critical task period, contrary to our expectations. We therefore conclude that activity in our parietal region is entirely epiphenomenal, and we begin our search anew for the neural basis of cognitive activity X".

Follow me here. If TMS is what we need to expose fMRI's empty and functionless epiphenomena for what they are -- technicolor gewgaws of no cognitive import -- then why are we never treated to hypothetical passages like the one quoted above? If we can't cite examples in the literature (perhaps such examples do exist, but I have just never come across them?), then it suggests that TMS, at least in the way it is being used, is not really the no-nonsense tough guy we thought it was. Indeed, TMS is a rubber stamp for fMRI. You can count on it never to expose an epiphenomenon. Don't worry, your epiphenomenon is safe with TMS.

Now, before you accuse me of being hard on TMS, let me say, I like TMS, I think it is a marvelous tool, I'm proud to call TMS a friend. But we need to consider why it rarely if ever exposes fMRI's ghoulish epiphenomena.

Reason #1. Exposing the epiphenomena involves proving the null. If zapping the parietal region has no effect on behavior, then the null hypothesis (e.g. epiphenomenon) is not rejected. The problem then is that conventional statistical inference is all about rejecting the null, and anything else is just a "negative finding". And negative findings are radically less likely to be published than positive ones. So, when it comes to exposing the epiphenomena, TMS has its hands tied. It couldn't expose an epiphenomenon even if it wanted to -- it could only "fail to reject the null hypothesis of an epiphenomenon". And who wants to do anything that sounds half as tedious as that?

Reason #2. One may speculate that fMRI epiphenomena (we're talking about replicable, reliable epiphenomena) are in point of fact pretty rare. In other words, if a region is reliably active during some cognitive paradigm, it's much more likely than not that the region is actually doing something functionally important, than that it is doing something useless and irrelevant to the task at hand. So when you zap the region with TMS, no surprise, performance tends to suffer.

Reason #3. Scientific orientation. Due to reason #1, you have to try pretty hard to expose an epiphenomenon with TMS. You have to be gunning for it. You're probably going to have to use unconventional Bayesian statistics to prove the null. How many researchers have undertaken a TMS study with the a priori hypothesis that the region that they were stimulating is only epiphenomenally involved in the cognitive activity required to perform the task at hand? Not many, I'd venture to say.
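What might "getting Bayesian" look like? One option among several (my choice for illustration, not a standard anyone has agreed on for TMS) is a Bayes factor in favor of the null, here computed with the BIC approximation for a one-sample test on TMS-minus-sham difference scores. The data are invented, and the function name `bf01_bic` is hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def bf01_bic(diffs):
    """Approximate Bayes factor for the null (no effect) over the alternative.

    Uses the BIC approximation for a one-sample t-test on difference scores:
    BF01 ~= sqrt(n) * (1 + t^2 / (n - 1)) ** (-n / 2).
    Values above 1 favor the null; values below 1 favor an effect.
    """
    n = len(diffs)
    t_value = mean(diffs) / (stdev(diffs) / sqrt(n))
    return sqrt(n) * (1 + t_value**2 / (n - 1)) ** (-n / 2)


# Hypothetical per-subject accuracy differences, TMS minus sham
# (illustrative values, not real data).
no_effect = [0.01, -0.02, 0.00, 0.015, -0.01, 0.005, -0.005, 0.02, -0.015, 0.0]
clear_effect = [-0.10, -0.12, -0.09, -0.11, -0.10, -0.13, -0.08, -0.10, -0.11, -0.12]

print(f"BF01, differences hovering around zero: {bf01_bic(no_effect):.2f}")
print(f"BF01, consistent TMS impairment:        {bf01_bic(clear_effect):.2e}")
```

Unlike a non-significant p-value, a BF01 comfortably above 1 is positive evidence for the null, which is the kind of statement an epiphenomenon hunter actually needs to make. With small samples the approximation is rough, so treat it as a sketch rather than a recommended analysis.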

To sum up, I think TMS is a wonderful tool, and there is some first rate TMS research going on. The future is bright for TMS. But is it an fMRI "epiphenomenon-killer"? It doesn't seem to be. This is because killing an epiphenomenon requires proving the null; because epiphenomena maybe aren't that prevalent or at any rate clear-cut to begin with; and because if you really want to be an epiphenomenon hunter, you have to want it in earnest. You're gonna have to get all Bayesian if at the end of the day you want to be standing triumphantly atop a giant epiphenomenal carcass, with your TMS spear stuck deep in its epiphenomenal guts.

Wednesday, September 1, 2010

More work needed

Over at Talking Brains there is an ongoing debate about whether language researchers in cognitive neuroscience are making enough progress. Fedorenko and Kanwisher think that "more work is needed", and that the answer is functional localizers. And they want to help. It's all very magnanimous.

Surely, though, there was never a greater scientific truism than the statement "more work is needed"! According to Karl Popper, more work is always needed, forever, in perpetuity. The scientific process is an inherently Sisyphean enterprise. We're doomed to be forever wrong, with probability = 1.

But of course Fedorenko and Kanwisher are right: more work is needed in the cognitive neuroscience of language. Nevertheless, I welcome you to read their remarks, because it seemed to me that there was also the implication that in two particular areas of study, namely theory-of-mind and the ventral visual stream, there is perhaps a somewhat less urgent need for more work:

"Several regions in the ventral visual stream have been shown to be exquisitely specialized for processing visual stimuli of a particular class (see e.g., Kanwisher, 2010, PNAS, for a recent overview). Furthermore, Saxe and colleagues have shown that a region in the right temporo-parietal junction selectively responds to stimuli that require us to think about what another person is thinking (e.g., Saxe & Powell, 2006, Psych Sci, and many other papers; see the publications section on the SaxeLab’s website: "

Surely if there were anything that was almost as good as scientifically true, verified, incontrovertible, consensus-worthy, etc. -- it would be that the fusiform face area (FFA in the fusiform gyrus) and the theory-of-mind area (TPJ in the right temporo-parietal junction) are 100% functionally specialized. Here, try this out. Next time you're at a neuroscience conference, out to dinner with a diverse assortment of your cleverest colleagues, stand up and say loudly and earnestly that you know with as much certainty as you know anything in this world that the right TPJ functions to "let one know what others are thinking" and that it only functions to "let one know what others are thinking". Add that it has been "demonstrated beyond any doubt". Then finish off the thought by saying: "No more work needed. It's a wrap". I promise that everyone will nod in vigorous -- in violent and perhaps even hysterical -- agreement.

OK, one caveat. Make sure Professor Jason Mitchell is not in attendance. Because he authored a paper in 2008 published in Cerebral Cortex entitled:

Activity in the right temporo-parietal junction is not selective for theory-of-mind

He also wrote this (2009, Philos Trans Royal Society):

"Intriguingly, the same pattern of medial frontal, temporo-parietal and medial parietal activity consistently accompanies a number of disparate tasks that, at first blush, appear to share little in common with mentalizing. Most notably, these regions are engaged by attempts to prospectively imagine the future or to retrospectively remember the past (Addis et al. 2007; Buckner & Carroll 2007; Schacter et al. 2007; Spreng et al. in press). For example, Addis et al. asked participants alternately to imagine their future self experiencing a specific event (cued by an object, such as ‘dress’) or to recall an actual event that occurred in their past. Both prospection and episodic memory engaged a highly overlapping network of regions that included MPFC, bilateral TPJ and medial parietal cortex. In addition, the same network has also been argued to play a role in spatial navigation (Buckner & Carroll 2007; Spreng et al. in press)."

Could it be that the TPJ does a whole bunch of things, indeed -- that the TPJ is a veritable Johannes Factotum, a cognitive dilettante flitting about from task to task, engaged in all manner of cognitive processes, and lending its functional activation about with a cavalier and unsophisticated disregard for the tenets of domain specificity? It seems that there is considerable evidence that this may be the case.

More from Professor Mitchell, same 2009 review:

"The fact that prospection, episodic memory, spatial navigation and mentalizing each draws on the same set of brain regions suggests that each likewise draws on a common set of cognitive processes. What cognitive challenge might these four disparate tasks share? One answer to this question is that each requires perceivers to conjure up a world other than the one that they currently inhabit: prospection obliges perceivers to imagine possible future scenarios; episodic memory relies on the reconstruction of bygone events; and spatial navigation often includes simulations of possible routes between locations. In other words, prospection and episodic memory can be conceived of as forms of mental time travel, and spatial navigation as a form of mental teleportation, all of which depend critically on the ability to project oneself outside of the here and now, imagining times or locations other than the one currently being experienced (Suddendorf & Corballis 2007)."

OK, I should also warn that you might also want to make sure that Drs Grit Hein and Robert Knight are not in attendance when you announce that the TPJ is the theory-of-mind region and that the TPJ is only the theory-of-mind region. Because they wrote a paper entitled: "Superior Temporal Sulcus, It's my area or is it?" in the Journal of Cognitive Neuroscience (2008). Here's an interesting quote:

"Activity in the 'ToM regions' in posterior STS, intersecting the parietal lobe, also correlated with differential effects in attentional reorienting. In line with our findings, this argues against distinct functional subregions in the STS and adjacent cortices and is more in favor of the assumption that the same STS region can serve different cognitive functions, as a flexible component in networks with other brain regions. There is abundant evidence for this proposition from neuroanatomical studies revealing bidirectional connections of the STS region with a variety of brain structures, such as the ventral and medial frontal cortex, lateral prefrontal and premotor areas, the parietal cortex, and mesial temporal regions (Seltzer & Pandya, 1989a, 1994)."

"In line with the network assumption, four of the five studies in the 'ToM' category (Kobayashi et al., 2007; Voellm et al., 2006; Takahashi et al., 2004; Gallagher et al., 2000) report medial prefrontal activity together with STS activation, whereas STS activity in speech processing was more accompanied by inferior frontal activation (Uppenkamp, Johnsrude, Norris, Marslen-Wilson, & Patterson, 2006; Rimol, Specht, Weis, Savoy, & Hugdahl, 2005). This might imply that the STS serves ToM when coactivated with medial prefrontal regions, while being involved in speech processing when coactivated with the inferior frontal cortex."

Let me conclude by saying I'm a huge fan of cognitive neuroscience researchers advancing strong theories about the function(s) of a brain area. I have written previously that the way forward is to embed our function terms in our structural ones -- to link them up. I have written with Mark D'Esposito a most thorough and plaintive hymn on the (quixotic?) quest for structure-function unity in phonological working memory ("The Search for the Phonological Store: From Loop to Convolution").

The "Fusiform Face Area" is a wonderful example of the merging of structure and function terms -- they are joined in a single moniker: FFA. Could the proposed link be wrong? Indeed it could. The function word "face" is just renting the space between "fusiform" and "area"; it can always be cleaved, excised, extracted from that position if the evidence warrants. Until that time, however, more work is needed.

Sunday, August 23, 2009

fMRI is not an inherently correlational method

If you open up your favorite cognitive neuroscience textbook it's very likely that you'll find it stated somewhere that "fMRI is a correlational method". Indeed, you'll read that this is one of its major drawbacks. On the other hand, transcranial magnetic stimulation (TMS), you'll be told, is a tool with which one can make honest-to-god causal inferences. fMRI = correlational; TMS = causal. That will be on the test. You can bank on it.

I don't really even need citations for this; it's conventional wisdom. I mean, everybody knows that fMRI is a correlational method. Of course it is! The notion that fMRI might not be a correlational method is simply too absurd to contemplate.

If I did not occasionally want to say something slightly outlandish, however, I would not bother maintaining this (biannually updated) blog.

So here it goes. I am going to say something slightly outlandish. Get ready for it.

"fMRI is not an inherently correlational method".

Having made such a highly unorthodox and possibly even dangerous claim, I should probably back it up with an argument. First we need to define our terms.

What is a Correlational Method?

A correlational method is one that examines the relationship between two measured variables over which the investigator has no experimental control. For instance, a study that examines the relationship between dietary cholesterol and heart disease is correlational. The experimenter exerts no control over either of the two variables. Correlational methods do not allow for causal inference. Just because we observe a correlation between dietary cholesterol and heart disease does not mean it can be concluded that one causes the other. Thus, as we learned in Statistics 101, correlation does not imply causation.

What Permits Causal Inference?

If we want to say something about causality, then we need to conduct a true experiment. Experiments allow the scientist to manipulate one variable (the independent variable) while measuring one or more other variables (the dependent variables) and holding everything else constant. If the experiment is properly controlled -- which is no easy thing, of course -- then any observed change in the dependent measure that is correlated with the experimental manipulation of interest is assumed to have been caused by that manipulation. Thus, under certain special circumstances -- i.e. when an independent variable is manipulated and experimental control is assured -- correlation does indeed imply causation.

Does fMRI Permit Causal Inference?

Having defined our terms, let us now address the question we set out to answer, namely: is fMRI a correlational method? Well, I must admit that fMRI seems awfully correlational at first blush. I mean, you put someone in a scanner and he presses buttons and looks at pictures and wiggles his toes and dozes off probably for a full third of the experiment -- and meanwhile you're capturing these images every couple of seconds that you then submit to a fancy correlational analysis which spits out colorful activation maps.... I will grant that it seems correlational.

Here's why it's not, though. An fMRI experiment generally speaking involves an independent variable that is manipulated by the experimenter and a dependent variable that is measured by the machine. The independent variable might be, for instance, whether the subject is viewing a face or a house; and the dependent variable is the blood-oxygenation-level-dependent (BOLD) imaging signal. If everything else except for the particular experimental variable of interest (face or house) is held constant, then such an fMRI paradigm constitutes, by definition, a True Experiment, and therefore permits of causal inference.

Causal Inference of What?

Perhaps I've engaged in a bit of sophistry. Sure, fMRI allows for causal inference of a kind, but it does not allow one to infer anything about the sorts of things one is actually interested in! Well, let's think about what one can infer with fMRI. You can always say (assuming reliable statistics and proper experimental control) that your experimental manipulation caused the change in brain activation, wherever it is found. So in our simple face-house experiment if we see more activity in the fusiform gyrus while subjects viewed faces we are free to say that this was caused by our experimental manipulation. Ditto if more activation were observed, say, in the cerebellum.

Of course, often we are interested in more than the simple relationship between a task manipulation and brain activity; rather, we are interested in some theoretical entity -- a "cognitive process", if you will -- that we hope to observe in action during the performance of a task that was expressly designed with that entity in mind. Putting aside the obvious problem that your pet cognitive process is almost certainly a figment of your imagination, it is highly likely that even the most subtle task manipulation will reliably prod into action a whole lot of cognitive processes in addition to that particular one you set out to manipulate. In other words, if you want to make inferences about cognitive processes, rather than task manipulations, you are going to have a very tough time of it. But this is not a problem peculiar to fMRI. It's just as big a problem for reaction time studies and eye-movement studies and any other method in cognitive science, including TMS.


What about TMS, anyway? Why is it that TMS is so widely assumed to be a "causal" method and fMRI a correlational one? In fMRI we can make a causal inference from task manipulation to a difference in brain activation. In TMS we can make a causal inference from brain manipulation to a difference in some behavioral measure. It's an epistemological wash. Both methods allow for causal inference, both are useful, and the two are in a certain sense complementary. All the issues relating to inferring something about "cognitive processes" are just as problematic for TMS as they are for fMRI.

But what about inferences about the "necessity" of a given region for a given "process"? Isn't this where TMS shines?

Not really, for the exact same reasons fMRI falls on its face here. If I apply TMS stimulation to a brain region and observe a behavioral effect, I can only say the stimulation to region X affected behavior Y. Suppose stimulating region X in turn stimulates region Y which in turn stimulates region Z which in turn disrupts a cognitive process A which in turn leads to impaired performance on task B? Was the stimulated region "necessary" for the performance of the task? No, it was not. It may have merely set off a chain of events that led to the excitation or depression of region Z -- the unsung, unknown necessary region in the sordid affair -- which eventually gave rise to the behavioral effect. The same sort of reasoning can be applied to fMRI activations, which are equally susceptible to the problem of indirect effects. It's easy to control the experimental environment with a task or magnetic stimulation, but it's real hard to control the brain.


So, TMS and fMRI are on more or less equal footing when it comes to the question of inferring whether a brain region is "necessary" for a task or not. This is not to say that the two methods do not potentially offer differing or complementary or even convergent evidence in support of this or that hypothesis of interest. On the contrary, I think the combining of fMRI and TMS is a very powerful approach. But I think the claim that TMS is "causal" and fMRI is "correlational" is -- unless someone can convince me otherwise -- wrong.

Sunday, February 22, 2009

Voodoo Meta-Analysis

In my previous post I ran some simulations that explored how various summary scores of cluster-wise correlation magnitude are affected by cluster size. I showed that the peak correlation and the two-stage correlation yield systematically higher correlation estimates than either the median or minimum correlation in a cluster. I also made a statement about the magnitude of the bias in such summary scores that was based on a misunderstanding of the "non-independence" error as described by Vul et al. in the "voodoo correlations" paper and in an in press book chapter by Vul and Kanwisher. I will return to my simulations and argue that they are indeed still informative, but first I want to discuss just what is meant by the "non-independence" error in neuroimaging, as defined by Vul and colleagues.

Understanding the "Non-Independence Error"

There is a common category of errors that often crops up in functional neuroimaging studies in which an ROI is selected on the basis of one statistical test and then a second non-independent test is carried out on the same data. This type of non-independence is often discussed in elementary neuroimaging tutorials and forums and is well-known and rather fiercely guarded against. It, moreover, involves two null hypothesis tests -- and therefore two statistical inferences, the second of which is biased to yield a result in favor of the experimenter's hypothesis. The category of errors referred to in Vul et al. subsumes, but is not limited to, these kinds of "two hypothesis test" errors.

Consider the case of a whole-brain correlation analysis. One has just carried out an analysis correlating some behavioral measure x with a measure of activity in every voxel in the brain. One has corrected for multiple comparisons and identified a number of "activated clusters" in the brain. So far so good. We have conducted one hypothesis test for each voxel in the brain. We are interested in finding where the significant clusters are located (if there are any at all) and we may also be interested in the magnitude of the correlations in those active clusters.

If we have corrected for multiple comparisons, then we may safely report the location of the clusters in x,y,z coordinates. What we may not do, according to Vul and colleagues, is report the magnitude of the correlation. Neither may we report the maximum of the cluster. Nor may we report the minimum of the cluster. We may not choose any voxel in the cluster randomly and report its value. Let me go further: we may not substitute the threshold (t = 0.6, say) to serve as a lower bound for the correlation magnitude. To report the magnitude of the correlation of a selected cluster, or any derivative measure thereof, is to commit the "non-independence error". [I note only in passing that if neuroimaging studies only ever reported the lower bound of a correlation (i.e. the threshold), no studies would ever report correlations greater than ~ 0.7].

One reason we may not (according to Vul et al.) report the magnitude of a correlation is that correlation estimates selected on the basis of a threshold t will on average be inflated relative to the "true" value. The reason for this is that above-threshold values are likely to have benefited from "favorable noise" and will therefore be biased upwards. The problem is akin to regression to the mean and is not specific to correlations or social neuroscience or even functional neuroimaging, per se. You can get an idea of the scope and generality of the concept in the recent chapter of Vul and Kanwisher -- which is an extended homily on the varieties of the "non-independence error" where you will, among other things, learn the virtue of not plotting your data:

“The most common, most simple, and most innocuous instance of non-independence occurs when researchers simply plot (rather than test) the signal change in a set of voxels that were selected based on that same signal change.” (pg 5)

Vul and Kanwisher are also critical of several authors for presenting bar plots indicating the pattern of means in a series of ROIs selected on the basis of a statistical contrast. We are told that such a presentation is "redundant" and "statistically guaranteed" (pg 11). I'll give you another example (this one I thought up all on my own) of Vul's non-independence error: reporting a correlation in the text of the results section and then also, quite redundantly and non-informatively, reporting the correlation separately in a table or figure legend. You see the range.

Before I begin with my critique of the Vul et al. meta-analysis, I just want to make it clear that the hypothesis that correlations in whole-brain analyses will tend to be inflated is quite reasonable. The other part of their hypothesis, that correlations are massively -- rather than, say, negligibly -- inflated needs to be backed up empirically. It is this aspect of their study -- the empirical part -- that I find unsatisfying (perhaps that is an understatement).
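To make the "favorable noise" idea concrete, here is a minimal simulation sketch. Everything in it is an assumption chosen for illustration: the true correlation (0.4), the sample size (18 subjects), the selection threshold (0.6), the seed, and the function names. It shows only that estimates which survive a threshold are, on average, higher than the true value; it says nothing about how large the inflation is in any real study.

```python
import math
import random

def pearson_r(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def simulate_r(true_rho, n_subjects, rng):
    """Draw one sample from a bivariate normal and return its sample r."""
    xs, ys = [], []
    for _ in range(n_subjects):
        x = rng.gauss(0.0, 1.0)
        noise = rng.gauss(0.0, 1.0)
        # y is built so that corr(x, y) equals true_rho in the population
        ys.append(true_rho * x + math.sqrt(1.0 - true_rho ** 2) * noise)
        xs.append(x)
    return pearson_r(xs, ys)

# Illustrative values only (assumptions, not taken from any study)
TRUE_RHO = 0.4
N_SUBJECTS = 18
THRESHOLD = 0.6

rng = random.Random(2009)
estimates = [simulate_r(TRUE_RHO, N_SUBJECTS, rng) for _ in range(5000)]
survivors = [r for r in estimates if r > THRESHOLD]

mean_all = sum(estimates) / len(estimates)
mean_survivors = sum(survivors) / len(survivors)
print(f"true rho: {TRUE_RHO}")
print(f"mean of all estimates: {mean_all:.3f}")
print(f"mean of estimates above {THRESHOLD}: {mean_survivors:.3f}")
```

Changing TRUE_RHO, N_SUBJECTS, or THRESHOLD changes how large the gap is, which is exactly the empirical question at issue.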

Voodoo Meta-Analysis and the Non-Independence Error

We have just remarked that in their in press chapter Vul and Kanwisher emphasize that the non-independence error is not limited to reporting of biased statistics, but may also involve the mere presentation of data that has been selected in a non-independent manner.

“Authors that show such graphs must usually recognize that it would be inappropriate to draw explicit conclusions from statistical tests on these data (as these tests are less common), but the graphs are presented regardless." (pg 8.)

Where else might we find an example of such a misleading presentation of data? Sometimes it turns out that the "non-independence error" is lurking in your own backyard. Take for instance the meta-analysis presented in the "voodoo correlations" paper by Vul et al. (in press). The thesis of this paper is quite straightforward. First, the authors surmise that correlations observed in brain imaging studies of social neuroscience are "impossibly high". Second, because the magnitude of correlations is intrinsically important, scientists must also provide accurate estimates of the correlation magnitude -- something that is not necessarily guaranteed by null hypothesis testing alone.

To explore the question further, Vul et al. searched the literature for social neuroscience papers that reported brain-behavior correlations, and then sent a series of survey questions to the authors of the selected papers. On the basis of the authors' responses and other unknown considerations, they classified the papers as either using (good) independent methods or using (suspect) non-independent methods.

What constitutes a "non-independent" analysis, you ask? Studies classified as non-independent were ones that selected significant clusters and reported the magnitude of these activations based on a summary score (usually the mean or maximum value of the cluster). Let me be absolutely clear about this because there has been some confusion about this issue (I'm pointing at myself here). These studies did not perform two non-independent statistical tests. They performed one and only one correlation for every voxel in the brain. Because such analyses perform many correlations over the brain (tens of thousands), a correction for multiple comparisons is imposed, resulting in high statistical thresholds to achieve a nominal alpha value of 0.05. The key point is that in the Vul et al. meta-analysis, non-independent analyses are by and large synonymous with whole-brain analyses. That is a crucial element to the argument that follows, so take note of it.

An independent analysis, on the other hand, was defined as an analysis that used a region of interest (ROI) that was defined either anatomically, via independent functional ROI or localizer, or through some combination of both. As a consequence, such independent analyses usually calculate just one or perhaps a handful of correlations, and therefore apply far more lenient statistical thresholds. For instance, in a large study with 37 subjects, such as the one by Matsuda and Komaki (2006, cited in Vul et al.), a correlation of 0.27 was declared statistically reliable at an alpha level of .05. A whole-brain analysis with the same number of subjects and a 0.001 alpha level would have required a correlation at least as great as 0.5 for a one-tailed test (in addition to whatever cluster extent threshold is applied). For a more typically sized 18 person study, the correlation would have had to be as large as 0.67 (one-tailed) to reach significance.
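The thresholds quoted above follow from the usual t-test for a correlation, t = r * sqrt(n - 2) / sqrt(1 - r^2), which can be inverted to give the smallest significant r for a given n and alpha. The sketch below does that with nothing but the standard library; the numerical integration and bisection are my own stand-ins for a statistics package, the function names are made up for illustration, and the test is assumed to be one-tailed throughout. The printed values should land close to the figures in the text.

```python
import math

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + t * t / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=1000):
    """P(T > t) for t >= 0, via Simpson's rule on [0, t] (steps must be even)."""
    if t == 0:
        return 0.5
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

def critical_r(n, alpha):
    """Smallest r that is significant at one-tailed alpha with n subjects.

    Assumes n >= 4 and alpha not much smaller than 0.0001, so that the
    critical t lies inside the bisection bracket [0, 50].
    """
    df = n - 2
    lo, hi = 0.0, 50.0
    for _ in range(60):  # bisection on the critical t value
        mid = (lo + hi) / 2
        if t_upper_tail(mid, df) > alpha:
            lo = mid
        else:
            hi = mid
    t_crit = (lo + hi) / 2
    # Invert t = r * sqrt(df) / sqrt(1 - r^2)
    return t_crit / math.sqrt(t_crit ** 2 + df)

for n, alpha in [(37, 0.05), (37, 0.001), (18, 0.001)]:
    print(f"n = {n}, one-tailed alpha = {alpha}: critical r = {critical_r(n, alpha):.2f}")
```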

So What's Wrong with the Voodoo Meta-Analysis?

Let me count the ways.

Remember, the aim of the Vul et al. meta-analysis is to establish that non-independent methods produce massively -- not just marginally -- inflated correlations. The meta-analysis itself is fundamentally an empirical, rather than theoretical, endeavor. Let me remind you that studies classified as "non-independent" are all whole-brain analyses, and therefore involve corrections for multiple comparisons that necessitate a large correlation magnitude to achieve statistical significance. Those studies classified as independent do not impose such high thresholds. The upshot of this is that a whole-brain (non-independent) analysis will by definition never report a correlation less than about 0.5 (assuming a large 37 subject maximum sample). On the other hand, independent analyses, because of their greater sensitivity, will report correlations as low as 0.27 (assuming the same 37 subject maximum sample).

What does this tell us? The classification of papers into "non-independent" and "independent" groups was guaranteed to produce higher correlations on average for the former than for the latter group, irrespective of whatever genuine inflation of correlation magnitudes may exist in the former category.

The same result could have been produced with a random number simulation. Suppose I sample numbers randomly from the range -1 to 1. In a first run I sample a number, check to see if it's greater than 0.3, and if so store it in an array. I keep doing this until I've got about 25 values. In a second run I sample numbers from the same underlying distribution, but I only accept a number greater than 0.6. I then plot a histogram, showing how the first group of numbers (plotted in green) is shifted to the left of the second group of numbers (plotted in red). Note that I'll have to sample more numbers in the latter case to collect as many values, but that's OK as I have an inexhaustible supply of random numbers to draw from. Compare this to the Vul et al. literature search which found approximately equal (30 and 26, respectively) numbers of independent and non-independent analyses even though the relative frequencies of the two classes may be very different. But PubMed, like a random number generator, is inexhaustible.
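Here is a rough sketch of that thought experiment. The thresholds (0.3 and 0.6), the range (-1 to 1) and the group size (25) come from the description above; the seed, the function name and the choice to print means rather than plot a histogram are my own assumptions.

```python
import random

def sample_above(threshold, count, rng):
    """Draw uniform numbers on [-1, 1] until `count` of them exceed `threshold`.

    Returns the accepted values and the total number of draws it took.
    """
    kept, draws = [], 0
    while len(kept) < count:
        x = rng.uniform(-1.0, 1.0)
        draws += 1
        if x > threshold:
            kept.append(x)
    return kept, draws

rng = random.Random(0)  # arbitrary seed, for repeatability only

# "Independent" stand-in: lenient threshold
lenient, lenient_draws = sample_above(0.3, 25, rng)
# "Non-independent" stand-in: strict threshold, same underlying distribution
strict, strict_draws = sample_above(0.6, 25, rng)

print(f"lenient group: mean = {sum(lenient) / len(lenient):.2f}, draws needed = {lenient_draws}")
print(f"strict group:  mean = {sum(strict) / len(strict):.2f}, draws needed = {strict_draws}")
```

Both groups come from the same generator, so any difference between them reflects the acceptance rule alone.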

There is a counterargument that Vul et al. might avail themselves of, however. They might argue that high thresholds and inflated correlations are inextricably linked. It is the high thresholds that lead to the inflated correlations in the first place. Unfortunately, the argument holds little water, as high thresholds would lead to high (significant) correlations even in the absence of any correlation "inflation", which happens to be consistent with the null hypothesis that the authors wish to reject (or persuade you to reject, as we shall see in the next section). Moreover, this argument, if seriously offered, would be a rather obvious example of "begging the question", a practice the authors strongly repudiate. Finally, the division of studies into the two groups is confounded by the differing sensitivities of the analyses, with non-independent studies sensitive only to larger magnitude correlations.

Voodoo Histogram

I would now like everyone to turn to page 14 of the "voodoo correlations" paper where you may get acquainted with the most famous histogram of 2009, the Christmas colored wonder showing the distribution of correlations among "independent" and "non-independent" studies that entered the Vul et al. survey. What is the purpose of this histogram? Before we answer that question, let us return to the central theses of Vul et al. First, correlation magnitudes matter. And, second, non-independent analyses produce grossly inflated correlations.

What evidence, other than a priori reasoning, do they adduce in favor of the inflation hypothesis? Well, as a starter, do they provide summary statistics, i.e. the mean or median correlation in the two groups? No. Do they perform a statistical test comparing the two samples for a shift in the central tendency using, for instance, a t-test or a non-parametric test of some kind? No. Do they carry out an analysis of the frequencies distributed over bins with a chi-square or equivalent statistical test? No. Finally, if correlation magnitudes matter, why does it appear that the authors make an exception to that rule in their own analysis, which fails to report an estimate of the difference in correlations between the two groups? After all, how are we to know how serious the error is, if there is one at all? Do we care if the bias in correlation magnitude is .001 or .05 or even .1? Probably not very much.

Now, the reason for the omission of any statistical test or summaries, I think, is that Vul et al., being virtuous abstainers from the "non-independence error", believed they could avoid its commission by eschewing a formal test -- and therefore insulate themselves against the charge of non-independence. Instead, they reasoned, "we'll just present a green/red colored histogram and let the human color perception system work its magic". (Sadly, since the authors used red and green squares in their histogram, color blind social neuroscientists are mystified as to what all the fuss is about.) Sometimes, however, it is enough to plot one's data to be found guilty of the "non-independence error".

Let me remind you of a passage from Vul and Kanwisher (in press) which contains more of the wit and wisdom of Ed Vul.

"Authors that show such [non-independent] graphs must usually recognize that it would be inappropriate to draw explicit conclusions from statistical tests on these data (as these tests are less common), but the graphs are presented regardless. Unfortunately, the non-independence of these graphs is usually not explicitly noted, and often not noticed, so the reader is often not warned that the graphs should carry little inferential weight." (Vul and Kanwisher, in press, pg. 8)

I think that quote is rather a nice summing up of the sad affair of the "voodoo histogram". The thing was based on non-independent data selection (due to the differing thresholds between the two groups and sundry other reasons described below) but was nevertheless used to persuade the reader of the correctness of the authors' main hypothesis. In the end, we do not know what to conclude from this meta-analysis, having been presented with no evidence in favor of the central hypotheses put forth by the authors. That the evidence was selected in a non-independent manner in the first place, due to the disparity in the statistical thresholds across groups, has a strange self-referential quality to it that reminds me of one of those Russellian paradoxes about barbers or Cretans and so on.

Cataloguing some of the Voodoo

The more I look at Vul and colleagues' meta-analysis the more perfect little pearls of "non-independence" turn up in its soft tissue. In the following sections I am simply going to give you a taste.

1) Vul et al. classified studies as "independent" that selected voxels based on a functional localizer and then correlated a behavioral measure with data extracted from that ROI. The majority of such studies identified the ROI used for the secondary correlation analysis with a whole-brain t-test conducted at the group level in normalized space. It so happens that the magnitude of a t-statistic is influenced by both the difference of two sample means (or the difference between a sample mean and a constant) and the variance of the sample. Thus, ROIs identified in this manner will have taken advantage of favorable noise that will ensure both large effects and small variance. As Lieberman et al. cleverly point out in their rebuttal to "voodoo correlations", low variance will inevitably lead to range restriction, a phenomenon that has the effect of artificially deflating correlations. Therefore, the studies labelled "independent" that used whole-brain t-tests to identify ROIs (the majority of such studies) were virtually guaranteed to produce reduced correlations, and therefore constitute another example of the "non-independence error" unwittingly perpetrated by the Vul et al. meta-analysis.

2) The meta-analysis fails to identify which studies reported the peak magnitude of a cluster and which studies reported mean correlations. Vul et al. repeatedly insist that "correlation magnitudes matter" and if this is the case it would be important to distinguish between those two sets. You may refer to my previous blog entry to see that average measures of correlation magnitude in a cluster hew towards the threshold, which is generally around .6 or .65 for a whole-brain analysis. On the other hand, "peak" values in a cluster are (by definition) not representative of the regional magnitude of the correlation estimate and, moreover, suffer from the problem of regression to the mean. It is very likely that virtually all the correlation estimates exceeding .8 come from studies that used the peak magnitude of the cluster. This is important to know! Remember, Vul et al.'s argument isn't to say that reporting only peak magnitudes is a bad practice, it's to say that reporting any summary measure in a selected cluster will result in massively inflated correlations. No evidence for that assertion is provided in the meta-analysis and critical information as to which summary measure was used for each study is not reported.

3) Localization of an ROI using an independent contrast is an imperfect process. Just as there is noise in the estimation of correlation magnitudes so too is there noise in the estimation of the "true location" of a functional area. Thus, spatial variation in the locus of a functional ROI ensures that a subsequent estimate of correlation magnitude will be systematically biased downwards. It would take many repetitions of the same experiment in the same group of subjects to arrive at a sufficiently accurate estimate of the "true location" (insofar as such a thing exists) of a functional region to mitigate this spatial selection error.

4) Vul et al. do not consider the possibility that exploratory whole-brain correlation analyses are much more likely to find genuine large magnitude correlations than hypothesis-driven ROI analyses. Think about it. An ROI analysis is about confirming a specific hypothesis that an experimenter has about a brain-behavior relationship. It's a one-shot deal. If the researcher is wrong, he comes up empty. Whether a significant correlation is observed depends a lot on whether the scientist was right to look in that particular region in the first place. But remember, the brain is still a mysterious object and sometimes neuroscientists aren't quite sure where exactly to look. It is in these cases, generally, that they turn to whole-brain analyses. Consider the following. Suppose I have an hypothesis about the relationship between monetary greed and brain activity in the orbitofrontal cortex. I define an ROI and perform a correlation between brain activity and some measure of the behavior of interest (greed). The correlation turns out to be 0.6. Now, what is the probability that some other region in the brain has a higher correlation than the one I discovered? Well, since we know relatively little about the brain I submit that the probability is very close to 1. If that's the case, for every ROI analysis that discovered a correlation with magnitude r, a corresponding whole-brain analysis, due to its exploratory nature, is likely to find those other regions that correlate more strongly with the behavioral measure than the ROI that was chosen on the basis of imperfect knowledge. The bottom line is that exploratory analyses have the opportunity, because they are exhaustive, to uncover the big correlations, while targeted ROI analyses are fundamentally limited by the experimenter's knowledge of the brain and the constraint of looking only in one location.

There are two parties looking to find a buried treasure. One party has a hazy hunch that the treasure is located in spot X. The other party, which is bank-rolled by the royal family, is composed of thousands of men who fan out all over the area, searching for the treasure in every conceivable place, leaving no stone unturned. If the first party's hunch is wrong, they strike out. The second party, however, by performing an exhaustive search, always succeeds in finding the treasure -- provided it exists.

5) A few more things to chew on. First, Vul et al.'s study included five Science and Nature studies combined, which accounted for 10% of all the studies (which means that these "big two" journals were vastly over-represented). Of those 5 papers, 13 out of the 14 correlations included were in the non-independent group. Second, of the 135 correlations in the non-independent group, 22 came from a single study (study 11, see Vul et al. appendix). Of the 55 correlations that were greater than .7 in the non-independent group, a whopping 24% (13/55) came from this same study. The mean number of correlations from each study was 4.9 and the standard deviation was 4.4, meaning that 22 is nearly 4 standard deviations above the mean and is therefore an outlier by anyone's standard. Remember that correlations drawn from the same study are non-independent, and therefore including 22 correlations from a single study -- especially when that study contributed a disproportionate number of correlations greater than 0.7 -- is rather dubious. Indeed, to avoid a non-independence error in this case, Vul et al. should have chosen only one correlation from each study -- and chosen that correlation randomly.
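
The outlier arithmetic in point 5, and the one-correlation-per-study remedy, can be sketched in a few lines of Python. Only the summary numbers (mean 4.9, SD 4.4, 22 correlations, 13 of 55) come from the text; the study labels and correlation values in `toy` and the helper `one_per_study` are made up for illustration.

```python
import random

# Summary numbers quoted in the text.
mean_per_study = 4.9
sd_per_study = 4.4
n_from_study_11 = 22

# How many standard deviations above the mean is 22?
z = (n_from_study_11 - mean_per_study) / sd_per_study

# Share of the r > .7 correlations contributed by that one study.
share_above_07 = 13 / 55

def one_per_study(correlations_by_study, seed=None):
    """Pick one correlation at random from each study (hypothetical helper)."""
    rng = random.Random(seed)
    return {study: rng.choice(rs) for study, rs in correlations_by_study.items()}

# Hypothetical toy data: study label -> reported correlations.
toy = {
    "study_a": [0.81, 0.74, 0.66],
    "study_b": [0.52],
    "study_c": [0.70, 0.88],
}
sample = one_per_study(toy, seed=0)

print(f"z = {z:.2f}, share of r > .7 from one study = {share_above_07:.1%}")
print(sample)
```

Sampling one value per study gives every study equal weight, so no single paper can dominate the pooled distribution.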

6) The variance of a correlation estimate is related to the sample size of the study. And yet Vul et al. fail to report the sample sizes of the 54 studies that entered their analysis. This is a serious omission for obvious reasons that Vul et al. should have been attuned to. This is another potential commission of the non-independence error in Vul et al.
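
To make the sample-size point concrete, here is a minimal sketch using the standard Fisher z-transform approximation, under which the standard error of the transformed correlation is 1/sqrt(n - 3). The correlation of 0.7 and the sample sizes are illustrative choices, not values taken from Vul et al., and `fisher_ci` is a hypothetical helper name.

```python
import math

def fisher_ci(r, n, z_crit=1.96):
    """Approximate 95% confidence interval for a Pearson r (Fisher z-transform)."""
    z = math.atanh(r)               # transform r to the z scale
    se = 1.0 / math.sqrt(n - 3)     # standard error shrinks as n grows
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)

# Illustrative sample sizes only.
for n in (10, 20, 40, 80):
    lo, hi = fisher_ci(0.7, n)
    print(f"n = {n:3d}: r = 0.70, approx. 95% CI = [{lo:.2f}, {hi:.2f}]")
```

The interval narrows steadily as n grows, which is why a reported correlation is hard to interpret without its sample size.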

7) I'm getting tired so I will be brief on this last one. The selection criteria for the papers that entered the meta-analysis are poorly described. For instance, how did 5 Science and Nature studies get into the sample? Is that an accident? If not, what was the rationale for choosing all those high-profile papers? A meta-analysis should either be exhaustive or otherwise take pains to achieve a representative sample -- and, if the latter, then it is incumbent on the authors to describe the selection criteria and methods in detail. For instance, were the persons who selected the papers blind to the hypothesis? And so on.


Vul, E., Harris, C., Winkielman, P., and Pashler, H. Voodoo Correlations in Social Neuroscience. Perspectives on Psychological Science. In Press.

Vul, E. and Kanwisher, N. Begging the Question: The Non-Independence Error in fMRI Data Analysis. Book Chapter. In Press.

Lieberman, M., Berkman, E., and Wager, T. Correlations in Social Neuroscience Aren't Voodoo: A Reply to Vul et al. Perspectives on Psychological Science. In Press.

Tuesday, February 17, 2009

Simulating Voodoo Correlations: How much voodoo, exactly, are we dealing with?

The recent article "Voodoo Correlations in Social Neuroscience" by Ed Vul and colleagues has gotten a lot of attention and has stimulated a great deal of discussion about statistical practices in functional neuroimaging. The main critique in the article by Vul involves a bias incurred when a correlation coefficient is re-computed by averaging over a cluster of active voxels that are selected from a whole-brain correlation analysis. Vul et al. correctly point out that the method will produce inflated estimates of the correlation magnitude. There have been several excellent replies to the original paper, including a detailed statistical rebuttal showing that the actual bias incurred by the two-stage correlation (henceforth: vul-correlation) is rather modest.

It occurred to me in thinking about this problem that the bias in the correlation magnitude should be related to the number of voxels included in the selected cluster. For instance, in the case of a 1 voxel cluster the bias is obviously zero since there is only one voxel to average over. How fast does this bias increase as a function of cluster volume in a typical fMRI data set with a typically complex spatial covariance structure? Consideration of the high correlation among voxels within a cluster led me to wonder about the true extent of bias in vul-correlations. For instance, in the most extreme case, where all voxels in a cluster are perfectly correlated, there is zero inflation due to averaging over voxels.

To explore these questions I ran some simulations with real world data. The data I used were from a study carried out on the old 4 Tesla magnet at UC Berkeley and consisted of a set of 27 spatially normalized and smoothed (7mm FWHM) contrasts in a verbal working memory experiment (delay period activation > baseline). The goal was to run many correlation analyses between the "real" contrast maps and a succession of randomly generated "behavioral scores". Thus, for each of 1000 iterations I sampled 27 values from a random normal distribution to create a set of random behavioral scores. I then computed the voxel-wise correlation between each set of scores and the set of 27 contrast maps. I then thresholded the resulting correlation maps at 0.6 (p = 0.001) and clustered the above-threshold voxels using FSL's "cluster" command. This resulted in 1000 thresholded (and clustered) statistical maps representing the correlation between a set of "real" contrast maps and 1000 randomly generated "behavioral scores".
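
The first stage of this procedure can be sketched roughly as follows. This is not the original analysis code: the real contrast maps are replaced by synthetic white noise (which lacks the spatial smoothness of real fMRI data), the voxel and iteration counts are scaled down, only positive correlations are thresholded, and the FSL clustering step is omitted. The name `voxelwise_correlation` is a hypothetical helper.

```python
import numpy as np

rng = np.random.default_rng(0)

n_subjects = 27
n_voxels = 5000       # stand-in; a real whole-brain map has far more voxels
n_iterations = 100    # the post used 1000
r_threshold = 0.6

# Assumption: synthetic stand-in for the 27 contrast maps, flattened to
# (subjects, voxels). Real maps are spatially smooth; this noise is not.
contrasts = rng.standard_normal((n_subjects, n_voxels))

def voxelwise_correlation(maps, scores):
    """Pearson r between `scores` and every column (voxel) of `maps`."""
    maps_c = maps - maps.mean(axis=0)
    scores_c = scores - scores.mean()
    numerator = scores_c @ maps_c
    denominator = np.sqrt((maps_c ** 2).sum(axis=0) * (scores_c ** 2).sum())
    return numerator / denominator

suprathreshold_counts = []
for _ in range(n_iterations):
    scores = rng.standard_normal(n_subjects)   # random "behavioral scores"
    r_map = voxelwise_correlation(contrasts, scores)
    suprathreshold_counts.append(int((r_map > r_threshold).sum()))

print("mean voxels above threshold per iteration:",
      np.mean(suprathreshold_counts))
```

With independent noise voxels the surviving voxels are scattered rather than clustered; the smoothness of real contrast maps is what produces the contiguous clusters analyzed in the post.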

Next, I loaded each of the 1000 statistical volumes and computed, for each active cluster, the minimum correlation in the cluster, the median correlation in the cluster, the maximum correlation in the cluster, and the two-stage vul-correlation. The vul-correlation was computed as follows: I extracted the matrix of values from the set of contrast maps for each cluster (rows = subjects (27), columns = voxels in the cluster) and averaged across the columns (voxels), yielding a new vector of 27 values. I then recomputed the correlation coefficient between this averaged vector and the original randomly generated "behavioral variable" (all 1000 of which had been saved in a text file). Then I plotted cluster volume in cubic centimeters against its median, maximum, and vul-correlations. Here's the result.
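
The per-cluster summary described above can be sketched like this, assuming the cluster's values have already been extracted into a subjects-by-voxels array. The function name `cluster_summary` and the toy cluster are hypothetical; the two limiting cases at the end are the ones mentioned earlier in the post (a single-voxel cluster, and perfectly correlated voxels).

```python
import numpy as np

def cluster_summary(cluster_values, scores):
    """Summarize one cluster.

    cluster_values: array of shape (n_subjects, n_voxels)
    scores: array of shape (n_subjects,)
    """
    # Correlation of the behavioral scores with each voxel separately.
    voxel_r = np.array([
        np.corrcoef(scores, cluster_values[:, v])[0, 1]
        for v in range(cluster_values.shape[1])
    ])
    # Two-stage correlation: average over voxels first, then correlate.
    cluster_mean = cluster_values.mean(axis=1)   # one value per subject
    two_stage_r = np.corrcoef(scores, cluster_mean)[0, 1]
    return {
        "min": float(voxel_r.min()),
        "median": float(np.median(voxel_r)),
        "max": float(voxel_r.max()),
        "two_stage": float(two_stage_r),
    }

rng = np.random.default_rng(1)
n_subjects, n_voxels = 27, 20
scores = rng.standard_normal(n_subjects)

# Hypothetical cluster: a shared signal plus independent voxel noise.
shared = rng.standard_normal(n_subjects)
cluster = shared[:, None] + 0.5 * rng.standard_normal((n_subjects, n_voxels))
summary = cluster_summary(cluster, scores)
print(summary)

# Limiting cases: one voxel, and perfectly correlated voxels.
single = cluster_summary(cluster[:, :1], scores)
perfect = cluster_summary(shared[:, None] * np.arange(1, 6), scores)
print(single, perfect)
```

In both limiting cases the two-stage value coincides with the per-voxel correlations, so averaging adds nothing.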

What you can see is that vul-correlation rapidly increases as a function of cluster volume, reaching asymptote at a correlation of about .73 and a cluster volume of roughly 2 cubic centimeters. You can see, however, that the maximum correlation, which is not a two-stage correlation, has almost the exact same functional profile. The median correlation within a cluster also increases somewhat, but not as high or as rapidly as the vul- and maximum- correlations.

To quantify the "bias" in the vul-correlation as a function of cluster size I plotted the difference between the vul-correlation and median correlation.

It is clear from this plot that the bias becomes maximal when the cluster size is approximately 3 cubic centimeters. That is, however, rather a large cluster by fMRI standards. For a 1 cubic centimeter cluster the bias is about .075 and for a 1/2 cubic centimeter cluster (approximately 20 3 x 3 x 3 mm voxels) the bias is about 0.06. I'm not sure whether that rises to the level of "voodoo". Perhaps voodoo of a Gilligan's Island variety. Minor voodoo, if you like.

Lastly, I examined the minimum correlation as a function of cluster size. Of course, the minimum correlation can never fall below the cluster threshold, which was .6. Thus, I thought that the minimum correlation might serve as a good lower bound for reporting correlation magnitudes. You can see from the plot below that for these random simulations, at least, the minimum correlation does not increase with cluster size. In fact, it tends to approach the correlation threshold, which is not surprising, as this is what would be expected in a noise distribution. This time I've plotted cluster volume on a log (base 2) scale for easier visualization of the trend.

So, what have I learned from this exercise? First, the amount of inflation incurred from a two-stage correlation (vul-correlation) increases as a function of cluster size. For smallish clusters (1/2 to 1 cubic centimeters) this bias is not that much, whereas for larger clusters the bias is as high as 0.1. Second, the maximum correlation has a nearly identical relation with cluster volume as does the vul-correlation. Finally, candidates for the reporting of cluster magnitudes could be the median or minimum correlations. The median correlation increases with cluster size, but not by much. The minimum correlation decreases with cluster size, but again not by much.

All in all, I think the problem identified by Vul et al. is a genuine one. Two-stage correlation estimates are inflated when compared to the median correlation within the cluster -- but not by that much. One reason for this is that the high threshold required to achieve significance in whole-brain analyses yields voxels that don't have much room to go up. In addition, the constituent voxels of a cluster are already highly correlated, so that the "truncation of the noise distribution" referred to by Vul et al. may be less than would be expected among truly independent voxels. So, perhaps, in the end the vul-correlation isn't so much a voodoo correlation as it is a vehicle for voodoo fame.

Thursday, May 8, 2008

"The Neural Data is More Sensitive than the Behavioral Data"

Before I get back to the "Four Ages of Functional Neuroimaging" I'd like to take a brief detour and talk a little bit about a phrase -- or a slogan, perhaps -- that one hears more and more in the neurosciences, namely, that: "the neural data is more sensitive than the behavioral data".

I'll give a little context. A speaker has just presented some data, say, on the relationship between hippocampal volume and a genetic polymorphism, or the effect of some drug on dopaminergic activity in the striatum. Impressive bar graphs are displayed, with big effects and little error bars. There is no doubt that the finding is Real, that such and such drug or such and such genetic polymorphism is having a measurable biological impact, and that it's interesting and worth studying, etc., etc.

Sometimes these biological data are presented alongside lots of "scatterplots" showing that the effects are also correlated with some behavioral measure, say, working memory capacity, or performance on the Wisconsin Card Sorting Task. If you have a biological finding and a scatterplot showing a relation to behavior, then you're golden. Everybody in the room is happy, even the cranky behavioral psychologist in the back.

But what if the speaker just presents the biological measure without the scatterplot, without the link to behavior? This is usually fine, provided no claims are made about behavioral relevance. Sometimes it really isn't that important to link the two. One is just trying to get a handle on the relationship between two neural variables (say gene X, and hippocampal volume) and no strong claims are made about causal links to some behavioral state. Someone else will figure that out, later. Sometimes, however, the speaker wants to make these strong claims, even without the scatterplot. Of course, the speaker would have liked to show the audience a nice brain-behavior correlation, and he or she almost certainly collected some behavioral index, but as occurs in science from time to time, the correlation failed to reach significance. And, thus, no scatterplot.

The talk concludes, the speaker having argued forcefully for the importance of drug X, because of its effect on brain system Y. The speaker goes on to say that the drug allows subjects to focus attention better and enhances working memory and general fluid cognition.

Hand goes up in the back -- it's the cranky behavioral psychologist. He has a kind of a gravelly voice and one has the distinct impression that he was asleep for most of the talk. Here's what he asks: "Did you measure any behavioral variables? Did administration of the drug have any effect on cognition, as measured by standard measures of memory, reaction time, etc?"

The speaker is ready for this. He is indeed smiling. He's been handling this question for years, and frankly, he's rather amused at the naivete of the questioner.

"Well", he or she says, "of course we had our subjects perform a whole battery of neuropsychological tests, cognitive tasks, and personality inventories, including the WCST, N-Back, Trails A, B, C, and D, the Simon task, the TPQ, the Sensation Seeking scale, the impulsivity scale, locus of control, etc. etc. but none of these measures were significantly correlated with our biological finding. Of course, this is no surprise, because as everybody knows the neural data is more sensitive than behavioral data." The cranky psychologist offers a slight grimace, but does not follow up with another question. Once again, the response worked its charm. After all, who is to argue? There was a big neural effect and no behavioral effect -- therefore, surely the neural data is indeed more sensitive than the behavioral data. Right?

But wait, one might ask: what is the neural data more sensitive to? That is surely an important question. Let's think. The neural data is more sensitive to neural differences (e.g. hippocampal volume) than the behavioral data is. That is true -- perfectly trivial but perfectly true. The converse is also -- trivially -- true: "The behavioral data is more sensitive to behavioral differences than the neural data is".

A more interesting statement would be as follows: "The neural data is more sensitive to behavioral differences than the behavioral data is". That would be a strong claim, but one that is rarely made. Instead, we get the stock "neural data is more sensitive than behavioral data" without any context or qualification. The problem is that this phrase, this slogan, this stock reply to the crabby behavioral psychologist, is empty of content and specificity.

Just to drive the point home, what if I told you that a stethoscope is more sensitive to differences in heart rate than any behavioral measure? Would you be surprised? But what if I went on to say that my heart rate measurements, because they are so sensitive, indicate that subjects with a faster heart rate live longer? But wait, asks the old guy in the back of the room, where is your behavioral evidence for that assertion (e.g. a measure of longevity)? Don't need any, because biological measures are more sensitive than behavioral measures.

Monday, August 13, 2007

The Four Ages of Functional Neuroimaging: Part 3

Rather than treat “cognition” as a separate realm where functions are described and diagrammed on sheets of paper, functional neuroimaging seeks to eliminate the mind-brain barrier, to deny that venerable dichotomy, and to shift the terrain from the ether of psychological abstraction to the material folds of the brain. It does not matter in the long run that the fusiform gyrus might not truly act as the unitary and modular processor of faces that the name (“FFA”) implies; rather, what is important is that this particular assertion about face processing is committed to a neural state of affairs, which is open to both empirical support and falsification. It is a hypothesis which bares itself to all, declining to hide in the protective shade provided by the term “neural correlate”. The FFA is not the neural correlate of the face processor – it is the face processor, that is its function. Thus, one might say that the end of the Silver Age of neuroimaging was characterized by an increasing willingness, bolstered by an accumulation of empirical support, to propose hypotheses about brain function that treated “cognition” as a thing to be described in neural terms, howsoever simplistic and inchoate, and to be informed by data derived from cognitive psychology, neuroscience, and functional neuroimaging itself.

The use of functional neuroimaging techniques in the study of the biological basis of human cognition and behavior is now entering a Golden Age. The necessary, but often atheoretical, project of “mapping” hypothesized cognitive functions onto discrete pieces of cortex is coming to an end. Rather than stating hypotheses in terms of models of cognition and then in effect “searching” for brain correlates (or proxies of cognitive components), many researchers are now taking an integrated approach, where hypotheses about functional anatomy are stated a priori, and imaging results are taken as evidence for or against a stated hypothesis. One no longer asks where a function is located but rather whether an hypothesized functional-anatomical correspondence provides an accurate picture of biological reality. In the areas where functional neuroimaging is having its greatest impact, it has managed to engage the interest of cognitive psychology and neuroscience. For instance, in long-term memory research, behavioral neuroscience, traditional cognitive psychology, and cognitive neuroscience researchers are increasingly involved in a unified pursuit, a joint conversation, centering on the role of the medial temporal lobe in memory. For example, the idea from psychology of a dichotomy between recollection and familiarity in long-term memory is being studied at all levels: in rats with implanted electrodes, with behavioral measures in psychological laboratories, and with human neuroimaging studies. Whereas in prior years neuroscience and psychology would be carrying out studies in isolation, each with its own idiosyncratic paradigms and nomenclature, the arrival of cognitive neuroscience and human neuroimaging is increasingly providing the bridge between psychological and brain research, and contributing to the emergence of a unified approach to a particular problem domain.

To give a more specific example of how the field is advancing, consider the hypothesis that the function of the inferior frontal gyrus is the “selection of competing alternatives” in the context of word retrieval (Thompson-Schill et al. 1997). This is a classic “Silver Age” function-structure proposition. It identifies a fairly large region of cortex with a particular function, without an overarching model of word retrieval or a specification of contextual factors or neural interactions. On the other hand, this hypothesis makes a rather stark and forthright claim about the function of a particular brain region, and has led to a small industry on the role of the inferior frontal gyrus in semantic retrieval. Competing hypotheses have been adduced which have highlighted the distinction between “controlled retrieval” and “selection from competing alternatives”, and have led to the development of paradigms specifically designed to arbitrate between these two notions. The neuroanatomical specificity of these hypotheses has also vastly improved, with newer theories proposing functional subdivisions within the inferior frontal gyrus that have led to a more nuanced understanding of the function of the region in word retrieval. Interactions with posterior temporal cortex have also been explored, and links to ideas deriving from models of word retrieval, such as that of Levelt, have prominently entered the discussion. Thus, what began as an assertion about the functional role of a brain region has led by degrees to a far more sophisticated appreciation of the role of ventrolateral frontal lobe structures in word retrieval.

One might ask what possible bearing does any of this have on cognitive psychology, traditionally conceived as the study of the function of the mind? The answer to this question is, to paraphrase Coltheart, “none”. Physics has nothing to say about meta-physics. Likewise, the discussion of the functions of brain regions and their interactions tells us nothing about the mind – so long as one insists that the mind exists in a realm apart from the mundane exertions of the brain. Thus, the ongoing debates about the hippocampus and the inferior frontal gyrus are not about “mapping” from mind to brain, as it used to be. Rather, structure and function are inseparably linked, and the common practice now is to examine these two facets of brain organization as an integrated whole, just as the mechanic considers the radiator as a thing that serves a particular purpose, without bothering with intermediary “car-minds” and “car-mind processes” and other such ornaments of functionalism.