<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Ioannidis on effect size inflation, with guest appearance by Bozo the Clown</title>
	<atom:link href="http://www.talyarkoni.org/blog/2009/11/21/ioannidis-on-effect-size-inflation-with-guest-appearance-by-bozo-the-clown/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.talyarkoni.org/blog/2009/11/21/ioannidis-on-effect-size-inflation-with-guest-appearance-by-bozo-the-clown/</link>
	<description>...or you get no soup for one year!</description>
	<lastBuildDate>Fri, 10 Feb 2012 12:37:32 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
	<item>
		<title>By: [citation needed]&#187; Blog Archive &#187; the &#8216;decline effect&#8217; doesn&#8217;t work that way</title>
		<link>http://www.talyarkoni.org/blog/2009/11/21/ioannidis-on-effect-size-inflation-with-guest-appearance-by-bozo-the-clown/comment-page-1/#comment-2518</link>
		<dc:creator>[citation needed]&#187; Blog Archive &#187; the &#8216;decline effect&#8217; doesn&#8217;t work that way</dc:creator>
		<pubDate>Thu, 16 Dec 2010 02:25:28 +0000</pubDate>
		<guid isPermaLink="false">http://www.talyarkoni.org/blog/?p=80#comment-2518</guid>
		<description>[...] Atlantic article), whose work I can&#8217;t say enough good things about (though I&#8217;ve tried). But lots of other people have had a hand in popularizing the same or similar ideas&#8211;many of [...]</description>
		<content:encoded><![CDATA[<p>[...] Atlantic article), whose work I can&#8217;t say enough good things about (though I&#8217;ve tried). But lots of other people have had a hand in popularizing the same or similar ideas&#8211;many of [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: [citation needed]&#187; Blog Archive &#187; fourteen questions about selection bias, circularity, nonindependence, etc.</title>
		<link>http://www.talyarkoni.org/blog/2009/11/21/ioannidis-on-effect-size-inflation-with-guest-appearance-by-bozo-the-clown/comment-page-1/#comment-1477</link>
		<dc:creator>[citation needed]&#187; Blog Archive &#187; fourteen questions about selection bias, circularity, nonindependence, etc.</dc:creator>
		<pubDate>Mon, 28 Jun 2010 05:19:58 +0000</pubDate>
		<guid isPermaLink="false">http://www.talyarkoni.org/blog/?p=80#comment-1477</guid>
		<description>[...] The only thing I&#8217;d add to this is that the bias in effect size estimates is not only common, but, in most cases, is probably very large. [...]</description>
		<content:encoded><![CDATA[<p>[...] The only thing I&#8217;d add to this is that the bias in effect size estimates is not only common, but, in most cases, is probably very large. [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: [citation needed]&#187; Blog Archive &#187; fMRI becomes big, big science</title>
		<link>http://www.talyarkoni.org/blog/2009/11/21/ioannidis-on-effect-size-inflation-with-guest-appearance-by-bozo-the-clown/comment-page-1/#comment-473</link>
		<dc:creator>[citation needed]&#187; Blog Archive &#187; fMRI becomes big, big science</dc:creator>
		<pubDate>Sat, 13 Mar 2010 02:21:03 +0000</pubDate>
		<guid isPermaLink="false">http://www.talyarkoni.org/blog/?p=80#comment-473</guid>
		<description>[...] are almost certainly massively inflated by small sample size (as I&#8217;ve discussed before here and in this [...]</description>
		<content:encoded><![CDATA[<p>[...] are almost certainly massively inflated by small sample size (as I&#8217;ve discussed before here and in this [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: [citation needed]&#187; Blog Archive &#187; specificity statistics for ROI analyses: a simple proposal</title>
		<link>http://www.talyarkoni.org/blog/2009/11/21/ioannidis-on-effect-size-inflation-with-guest-appearance-by-bozo-the-clown/comment-page-1/#comment-56</link>
		<dc:creator>[citation needed]&#187; Blog Archive &#187; specificity statistics for ROI analyses: a simple proposal</dc:creator>
		<pubDate>Mon, 14 Dec 2009 06:36:50 +0000</pubDate>
		<guid isPermaLink="false">http://www.talyarkoni.org/blog/?p=80#comment-56</guid>
		<description>[...] the problem that approach raises&#8211;which I&#8217;ve discussed in more detail here&#8211;is the familiar one of multiple comparisons: If you&#8217;re going to test 100,000 locations, [...]</description>
		<content:encoded><![CDATA[<p>[...] the problem that approach raises&#8211;which I&#8217;ve discussed in more detail here&#8211;is the familiar one of multiple comparisons: If you&#8217;re going to test 100,000 locations, [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Blog Review: Citation Needed &#171; The Amazing World of Psychiatry: A Psychiatry Blog</title>
		<link>http://www.talyarkoni.org/blog/2009/11/21/ioannidis-on-effect-size-inflation-with-guest-appearance-by-bozo-the-clown/comment-page-1/#comment-49</link>
		<dc:creator>Blog Review: Citation Needed &#171; The Amazing World of Psychiatry: A Psychiatry Blog</dc:creator>
		<pubDate>Sat, 12 Dec 2009 09:46:04 +0000</pubDate>
		<guid isPermaLink="false">http://www.talyarkoni.org/blog/?p=80#comment-49</guid>
		<description>[...] acquire large datasets for use in research and this is certainly a very interesting idea. This is a very nice post and appeals to me because i&#8217;ve spent a bit of time looking into the original article and the [...]</description>
		<content:encoded><![CDATA[<p>[...] acquire large datasets for use in research and this is certainly a very interesting idea. This is a very nice post and appeals to me because i&#8217;ve spent a bit of time looking into the original article and the [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Neuroskeptic</title>
		<link>http://www.talyarkoni.org/blog/2009/11/21/ioannidis-on-effect-size-inflation-with-guest-appearance-by-bozo-the-clown/comment-page-1/#comment-24</link>
		<dc:creator>Neuroskeptic</dc:creator>
		<pubDate>Thu, 26 Nov 2009 12:43:25 +0000</pubDate>
		<guid isPermaLink="false">http://www.talyarkoni.org/blog/?p=80#comment-24</guid>
		<description>&lt;i&gt;&quot;The Journal of Cognitive Neuroscience already experimented with something like this back when they required authors to submit their raw data along with manuscripts.&quot;&lt;/i&gt;

Ah, I wasn&#039;t aware of that. But that&#039;s not quite what I was suggesting. &lt;i&gt;Raw&lt;/i&gt; data is not very interesting, especially when it&#039;s fMRI data, which will take you hours to analyze at a minimum.

I&#039;d want the &lt;i&gt;final&lt;/i&gt; analyzed statistical parametric maps made available. And ideally all of the intermediate steps which the data went through on its journey from raw data to final results.

My ultimate fantasy solution would be to have one massive supercomputer where everyone in the world uploads their data and does all their analysis, and everyone can see what everyone else is doing and everything they&#039;ve ever done. This is not going to happen... but I think this is the ideal - total openness about what we do. Because frankly as scientists (and especially as scientists paid by the taxpayer!) we should be completely open about what we do, there shouldn&#039;t be any &quot;file drawers&quot; at all.</description>
		<content:encoded><![CDATA[<p><i>&#8220;The Journal of Cognitive Neuroscience already experimented with something like this back when they required authors to submit their raw data along with manuscripts.&#8221;</i></p>
<p>Ah, I wasn&#8217;t aware of that. But that&#8217;s not quite what I was suggesting. <i>Raw</i> data is not very interesting, especially when it&#8217;s fMRI data, which will take you hours to analyze at a minimum.</p>
<p>I&#8217;d want the <i>final</i> analyzed statistical parametric maps made available. And ideally all of the intermediate steps which the data went through on its journey from raw data to final results.</p>
<p>My ultimate fantasy solution would be to have one massive supercomputer where everyone in the world uploads their data and does all their analysis, and everyone can see what everyone else is doing and everything they&#8217;ve ever done. This is not going to happen&#8230; but I think this is the ideal &#8211; total openness about what we do. Because frankly as scientists (and especially as scientists paid by the taxpayer!) we should be completely open about what we do, there shouldn&#8217;t be any &#8220;file drawers&#8221; at all.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: tal</title>
		<link>http://www.talyarkoni.org/blog/2009/11/21/ioannidis-on-effect-size-inflation-with-guest-appearance-by-bozo-the-clown/comment-page-1/#comment-18</link>
		<dc:creator>tal</dc:creator>
		<pubDate>Mon, 23 Nov 2009 15:53:34 +0000</pubDate>
		<guid isPermaLink="false">http://www.talyarkoni.org/blog/?p=80#comment-18</guid>
		<description>Hi Ed,

Thanks for the comment. I think cross-validation and independent localization are generally a good idea, but I&#039;m not sure they help much with inflation caused by sampling error (as opposed to measurement error). Actually, there are two issues. One is that power takes a hit any time you reduce the data. So in the case of individual differences in particular, I think it&#039;s usually a bad idea to take a sample of, say, 20 subjects and split it in half, because (a) power in one half of the sample is going to be so low you won&#039;t detect anything and (b) uncertainty in the other half is going to be so great you won&#039;t be able to replicate anything anyway. But if you have a sample of, say, 500 people, and you&#039;re not dealing with tiny effects, then sure, this is a good way to go (and is something I do when I have behavioral samples that big to work with).

Doing cross-validation or independent localization&lt;em&gt; within&lt;/em&gt; subjects is less of an issue from a power perspective. The problem here though is that you don&#039;t necessarily reduce inflation that&#039;s caused by sampling error. You can allow that you&#039;re measuring each individual&#039;s true score with perfect reliability and still have massive inflation caused by low power/selection bias. And cross-validation really won&#039;t help you in that case. You could do even/odd runs or localizers or leave-one-out analyses, and the fundamental problem remains that people&#039;s scores could be more or less the same across all permutations of the data and still grossly unrepresentative of the population distribution. That doesn&#039;t mean we shouldn&#039;t do these types of analyses, since they do nicely control for other types of error; but I&#039;m not at all sure they do much to solve the particular problem I&#039;m talking about.</description>
		<content:encoded><![CDATA[<p>Hi Ed,</p>
<p>Thanks for the comment. I think cross-validation and independent localization are generally a good idea, but I&#8217;m not sure they help much with inflation caused by sampling error (as opposed to measurement error). Actually, there are two issues. One is that power takes a hit any time you reduce the data. So in the case of individual differences in particular, I think it&#8217;s usually a bad idea to take a sample of, say, 20 subjects and split it in half, because (a) power in one half of the sample is going to be so low you won&#8217;t detect anything and (b) uncertainty in the other half is going to be so great you won&#8217;t be able to replicate anything anyway. But if you have a sample of, say, 500 people, and you&#8217;re not dealing with tiny effects, then sure, this is a good way to go (and is something I do when I have behavioral samples that big to work with).</p>
<p>Doing cross-validation or independent localization<em> within</em> subjects is less of an issue from a power perspective. The problem here though is that you don&#8217;t necessarily reduce inflation that&#8217;s caused by sampling error. You can allow that you&#8217;re measuring each individual&#8217;s true score with perfect reliability and still have massive inflation caused by low power/selection bias. And cross-validation really won&#8217;t help you in that case. You could do even/odd runs or localizers or leave-one-out analyses, and the fundamental problem remains that people&#8217;s scores could be more or less the same across all permutations of the data and still grossly unrepresentative of the population distribution. That doesn&#8217;t mean we shouldn&#8217;t do these types of analyses, since they do nicely control for other types of error; but I&#8217;m not at all sure they do much to solve the particular problem I&#8217;m talking about.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: ed vul</title>
		<link>http://www.talyarkoni.org/blog/2009/11/21/ioannidis-on-effect-size-inflation-with-guest-appearance-by-bozo-the-clown/comment-page-1/#comment-17</link>
		<dc:creator>ed vul</dc:creator>
		<pubDate>Mon, 23 Nov 2009 15:13:49 +0000</pubDate>
		<guid isPermaLink="false">http://www.talyarkoni.org/blog/?p=80#comment-17</guid>
		<description>Hi Tal,

That&#039;s a nice write-up, but I&#039;m wondering why you think there isn&#039;t a practical &quot;solution&quot;?  A wholesale solution seems unlikely -- publication bias will always be there in one form or another -- but the huge exacerbation of this inflation that arises in fMRI (due to the sheer dimensionality of the data) can be solved:

Use one portion of the data to reduce the dimensionality to a much smaller set of ROIs, and the other part of the data to assess the effect size (either using independent localizers, or using cross-validation methods).  

Using independent localizers to reduce the dimensionality of fMRI data (by several orders of magnitude) will go a long way to minimizing the problem:  It will still be the case that the reported significant effects in any given ROI will be inflated, like all significant effects are inflated, but much less so.  Moreover, if the dimensionality of fMRI is reduced sufficiently, there will only be a few ROIs, and reporting confidence intervals on non-significant effects ceases to be wildly impractical...

Cheers,
Ed</description>
		<content:encoded><![CDATA[<p>Hi Tal,</p>
<p>That&#8217;s a nice write-up, but I&#8217;m wondering why you think there isn&#8217;t a practical &#8220;solution&#8221;?  A wholesale solution seems unlikely &#8212; publication bias will always be there in one form or another &#8212; but the huge exacerbation of this inflation that arises in fMRI (due to the sheer dimensionality of the data) can be solved: </p>
<p>Use one portion of the data to reduce the dimensionality to a much smaller set of ROIs, and the other part of the data to assess the effect size (either using independent localizers, or using cross-validation methods).  </p>
<p>Using independent localizers to reduce the dimensionality of fMRI data (by several orders of magnitude) will go a long way to minimizing the problem:  It will still be the case that the reported significant effects in any given ROI will be inflated, like all significant effects are inflated, but much less so.  Moreover, if the dimensionality of fMRI is reduced sufficiently, there will only be a few ROIs, and reporting confidence intervals on non-significant effects ceases to be wildly impractical&#8230;</p>
<p>Cheers,<br />
Ed</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: tal</title>
		<link>http://www.talyarkoni.org/blog/2009/11/21/ioannidis-on-effect-size-inflation-with-guest-appearance-by-bozo-the-clown/comment-page-1/#comment-15</link>
		<dc:creator>tal</dc:creator>
		<pubDate>Sun, 22 Nov 2009 17:23:58 +0000</pubDate>
		<guid isPermaLink="false">http://www.talyarkoni.org/blog/?p=80#comment-15</guid>
		<description>Hi Neuroskeptic,

Thanks for the comments and compliment!

&lt;em&gt;It’s true that no journal is going to publish all of the data from an fMRI experiment but – do we need journals to report fMRI results? Why couldn’t we make all of our results available online?&lt;/em&gt;

I totally agree with this idea, and it&#039;s actually something I talk about as really being the ultimate long-term solution when I give talks on this stuff (and also in the paper I&#039;m working on). Ultimately, the only (or at least, best) way to solve the problem is to dump all data, significant or not, in a massive database somewhere. But the emphasis in my post was on &quot;easy&quot; fixes, which this certainly isn&#039;t. I think the main problem isn&#039;t a technical one--there already are online databases (e.g., BrainMap and SuMS DB) that could in theory be modified to handle complete maps instead of coordinates--it&#039;s a sociological one. It&#039;s going to be pretty difficult to convince researchers to (a) allow their raw data to be submitted to a repository and (b) actually do it themselves (since it would inevitably entail filling out a bunch of meta-data for each map). I&#039;m not saying this isn&#039;t the way to go, just that it&#039;s not going to be an easy sell.

The Journal of Cognitive Neuroscience already experimented with something like this back when they required authors to submit their raw data along with manuscripts. I think the general verdict is that it was a nice idea that didn&#039;t really work in practice, because (a) it was a pain in the ass to process, (b) authors weren&#039;t happy with it, and (c) hardly anyone actually requested or used the data. These aren&#039;t insurmountable problems, and to some extent they&#039;d be ameliorated by having an online database rather than an off-line one, but nonetheless, I think there are some big challenges involved...</description>
		<content:encoded><![CDATA[<p>Hi Neuroskeptic,</p>
<p>Thanks for the comments and compliment!</p>
<p><em>It’s true that no journal is going to publish all of the data from an fMRI experiment but – do we need journals to report fMRI results? Why couldn’t we make all of our results available online?</em></p>
<p>I totally agree with this idea, and it&#8217;s actually something I talk about as really being the ultimate long-term solution when I give talks on this stuff (and also in the paper I&#8217;m working on). Ultimately, the only (or at least, best) way to solve the problem is to dump all data, significant or not, in a massive database somewhere. But the emphasis in my post was on &#8220;easy&#8221; fixes, which this certainly isn&#8217;t. I think the main problem isn&#8217;t a technical one&#8211;there already are online databases (e.g., BrainMap and SuMS DB) that could in theory be modified to handle complete maps instead of coordinates&#8211;it&#8217;s a sociological one. It&#8217;s going to be pretty difficult to convince researchers to (a) allow their raw data to be submitted to a repository and (b) actually do it themselves (since it would inevitably entail filling out a bunch of meta-data for each map). I&#8217;m not saying this isn&#8217;t the way to go, just that it&#8217;s not going to be an easy sell.</p>
<p>The Journal of Cognitive Neuroscience already experimented with something like this back when they required authors to submit their raw data along with manuscripts. I think the general verdict is that it was a nice idea that didn&#8217;t really work in practice, because (a) it was a pain in the ass to process, (b) authors weren&#8217;t happy with it, and (c) hardly anyone actually requested or used the data. These aren&#8217;t insurmountable problems, and to some extent they&#8217;d be ameliorated by having an online database rather than an off-line one, but nonetheless, I think there are some big challenges involved&#8230;</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Neuroskeptic</title>
		<link>http://www.talyarkoni.org/blog/2009/11/21/ioannidis-on-effect-size-inflation-with-guest-appearance-by-bozo-the-clown/comment-page-1/#comment-14</link>
		<dc:creator>Neuroskeptic</dc:creator>
		<pubDate>Sun, 22 Nov 2009 13:20:02 +0000</pubDate>
		<guid isPermaLink="false">http://www.talyarkoni.org/blog/?p=80#comment-14</guid>
		<description>Don&#039;t feel too bad - I&#039;m an Ioannidis fan and I&#039;d never heard of this paper before either!

The problem of effect size inflation is pretty easy to understand but as you say, solving it is hard. Still I wonder whether, in the case of fMRI specifically, there isn&#039;t a possible solution...

&lt;i&gt;&quot;While it’d be nice if there was an easy fix for this problem, there really isn’t one. In behavioral domains, there’s often a relatively simple prescription: report all effect sizes, both significant and non-significant. This doesn’t entirely solve the problem, because people are still likely to overemphasize statistically significant results relative to non-significant ones; but at least at that point you can say you’ve done what you can. In the fMRI literature, this course of action isn’t really available, because most journal editors are not going to be very happy with you when you send them a 25-page table that reports effect sizes and p-values for each of the 100,000 voxels you tested.&quot;&lt;/i&gt;

It&#039;s true that no journal is going to publish all of the data from an fMRI experiment but - do we need journals to report fMRI results? Why couldn&#039;t we make all of our results available online?

Suppose you look for a correlation between personality and neural activity while looking at pictures of clowns. You take 10 personality measures and correlate each one with activity in every voxel. You apply a ridiculously conservative Bonferroni correction and you find some blobs where activity correlates very strongly with personality.

Now if you publish this in a journal, you&#039;re only going to report those blobs where the effect size is very high, because you will be applying a very conservative threshold. That&#039;s the problem that Ioannidis and you identified. But what you actually have in this example is 10 3D brain volumes where higher values mean more correlation.

Why not make those 3D images your results, and leave it to the reader to threshold them? If the reader then wants to threshold them conservatively they can do so, knowing (hopefully) not to take the effect sizes too seriously, then if they want they can calculate the correlations in particular areas using different thresholds or no thresholds at all... all of which will be appropriate for certain purposes.

It&#039;s always struck me as odd that we publish fMRI data in journals in exactly the same way as 19th century scientists published their results.</description>
		<content:encoded><![CDATA[<p>Don&#8217;t feel too bad &#8211; I&#8217;m an Ioannidis fan and I&#8217;d never heard of this paper before either!</p>
<p>The problem of effect size inflation is pretty easy to understand but as you say, solving it is hard. Still I wonder whether, in the case of fMRI specifically, there isn&#8217;t a possible solution&#8230;</p>
<p><i>&#8220;While it’d be nice if there was an easy fix for this problem, there really isn’t one. In behavioral domains, there’s often a relatively simple prescription: report all effect sizes, both significant and non-significant. This doesn’t entirely solve the problem, because people are still likely to overemphasize statistically significant results relative to non-significant ones; but at least at that point you can say you’ve done what you can. In the fMRI literature, this course of action isn’t really available, because most journal editors are not going to be very happy with you when you send them a 25-page table that reports effect sizes and p-values for each of the 100,000 voxels you tested.&#8221;</i></p>
<p>It&#8217;s true that no journal is going to publish all of the data from an fMRI experiment but &#8211; do we need journals to report fMRI results? Why couldn&#8217;t we make all of our results available online?</p>
<p>Suppose you look for a correlation between personality and neural activity while looking at pictures of clowns. You take 10 personality measures and correlate each one with activity in every voxel. You apply a ridiculously conservative Bonferroni correction and you find some blobs where activity correlates very strongly with personality.</p>
<p>Now if you publish this in a journal, you&#8217;re only going to report those blobs where the effect size is very high, because you will be applying a very conservative threshold. That&#8217;s the problem that Ioannidis and you identified. But what you actually have in this example is 10 3D brain volumes where higher values mean more correlation.</p>
<p>Why not make those 3D images your results, and leave it to the reader to threshold them? If the reader then wants to threshold them conservatively they can do so, knowing (hopefully) not to take the effect sizes too seriously, then if they want they can calculate the correlations in particular areas using different thresholds or no thresholds at all&#8230; all of which will be appropriate for certain purposes.</p>
<p>It&#8217;s always struck me as odd that we publish fMRI data in journals in exactly the same way as 19th century scientists published their results.</p>
]]></content:encoded>
	</item>
</channel>
</rss>

