There is no “tone” problem in psychology

Much ink has been spilled in the last week or so over the so-called “tone” problem in psychology, and what to do about it. I speak here, of course, of the now infamous (and as-yet unpublished) APS Observer column by APS Past President Susan Fiske, in which she argues rather strenuously that psychology is in danger of falling prey to “mob rule” due to the proliferation of online criticism generated by “self-appointed destructo-critics” who “ignore ethical rules of conduct.”

Plenty of people have already weighed in on the topic (my favorite summary is Andrew Gelman’s take), and to be honest, I don’t really have (m)any new thoughts to offer. But since that’s never stopped me before, I will now proceed to throw those thoughts at you anyway, just for good measure.

Since I’m verbose but not inconsiderate, I’ll summarize my main points way up here, so you don’t have to read 6,500 more words just to decide that you disagree with me. Basically, I argue the following points:

  1. There is nothing wrong with the general tone of our discourse in psychology at the moment.
  2. Even if there was something wrong with the tone of our discourse, it would be deeply counterproductive to waste our time talking about it in vague general terms.
  3. Fear of having one’s scientific findings torn apart by others is not unusual or pathological; it’s actually a completely normal–and healthy–feeling for a scientist.
  4. Appeals to fairness are not worth taking seriously unless the argument is pitched at the level of the entire scientific community, rather than just the sub-community one happens to belong to.
  5. When other scientists do things we don’t like, it’s pointless and counterproductive to question their motives.

There, that’s about as much of being brief and to the point as I can handle. From here on out, it’s all adjective soup, mixed metaphor, and an occasional literary allusion*.

1. There is no tone problem

Much of the recent discussion over how psychologists should be talking to one another simply takes it for granted that there’s some deep problem with the tone of our scientific discourse. Personally, I don’t think there is (and on the off-chance we’re doing this by vote count, neither do Andrew Gelman, Chris Chambers, Sam Schwarzkopf, or NeuroAnaTody). At the very least, I haven’t seen any good evidence for it. As far as I can tell, all of the complaints about tone thus far have been based exclusively on either (a) a handful of rather over-the-top individual examples of bad behavior, or (b) vague but unsupported allegations that certain abusive practices are actually quite common. Neither of these constitutes a satisfactory argument, in my view. The former isn’t useful because anecdotes are just that. I imagine many people can easily bring to mind several instances of what seem like unwarranted attacks on social media. For example, perhaps you don’t like the way James Coyne sometimes calls out people he disagrees with on Twitter.

Or maybe you don’t appreciate Dan Gilbert referring to a large group of researchers with little in common except their efforts to replicate one or more studies as “shameless little bullies”.

I don’t doubt that statements like these can and do offend some people, and I think people who are offended should certainly feel free to publicly raise their concerns (ideally by directly responding to the authors of such remarks). Still, such cases are the exception, not the norm, and academic psychologists should appreciate better than most people the dangers of over-generalizing from individual cases. Nobody should labor under any misapprehension that it’s possible to have a field made up of thousands of researchers all going about their daily business without some small subset of people publicly being assholes to one another. Achieving zero instances of bad behavior cannot be a sane goal for our field (or any other field). When Dan Gilbert called replicators “second-stringers” and “shameless little bullies,” it did not follow that all social psychologists above the age of 45 are reactionary jackasses. For that matter, it didn’t even follow that Gilbert is a jerk. The correct attributions in such cases–until such time as our list of notable examples grows many times larger than it presently is–are that (a) reasonable people sometimes say unreasonable things they later regret, or (b) some people are just not reasonable, and are best ignored. There is no reason to invent a general tone problem where none exists.

The other main argument for the existence of a “tone” problem—and one that’s prominently on display in Fiske’s op-ed—is the gossipy everyone-knows-this-stuff-is-happening kind of argument. You could be excused for reading Fiske’s op-ed and coming away thinking that verbal abuse is a rampant problem in psychology. Consider just one paragraph (but the rest of it reads much the same):

Only what’s crashing are people. These unmoderated attacks create collateral damage to targets’ careers and well being, with no accountability for the bullies. Our colleagues at all career stages are leaving the field because of the sheer adversarial viciousness. I have heard from graduate students opting out of academia, assistant professors afraid to come up for tenure, mid-career people wondering how to protect their labs, and senior faculty retiring early, all because of methodological terrorism. I am not naming names because ad hominem smear tactics are already damaging our field. Instead, I am describing a dangerous minority trend that has an outsized impact and a chilling effect on scientific discourse.

I will be the first to admit that it sounds very ominous, all this talk of people crashing, unmoderated attacks with no accountability, and people leaving the field. But before you panic, you might want to consider an alternative paragraph that, at least from where I’m sitting, Fiske could just as easily have written:

Only what’s crashing are people. The proliferation of flashy, statistically incompetent findings creates collateral damage to targets’ careers and well being, with no accountability for the people who produce such dreck. Our colleagues at all career stages are leaving the field due to the sheer atrocity of its standards. I have heard from graduate students opting out of academia, assistant professors suffering from depression, mid-career people wondering how to sustain their research, and senior faculty retiring early, all because of their dismay at common methodological practices. I am not naming names because ad hominem smear tactics are already damaging our field. Instead, I am describing a dangerous trend that has an outsized impact and a chilling effect on scientific progress.

Or if you don’t like that one, maybe this one is more your speed:

Only what’s crashing are our students. These unmoderated attacks on students by their faculty advisors create collateral damage to our students, with no accountability for the bullies. Our students at all stages of graduate school are leaving the field because of the sheer adversarial viciousness. I have heard from graduate students who work 90-hour weeks, are afraid to have children at this stage of their careers, or have fled grad school, all out of fear of being terrorized by their advisors. I am not naming names because ad hominem smear tactics are already damaging our field. Instead, I am describing a dangerous trend that has an outsized impact and a chilling effect on scientific progress.

If you don’t like that one either, feel free to crib the general structure and play fill in the blank with your favorite issue. It could be low salaries, unreasonable publication expectations, or excessively high teaching loads; whatever you like. The formula is simple: first, you find a few people with (perfectly legitimate) concerns about some aspect of their professional environment; then you just have to (1) recount those stories in horrified tones, (2) leave out any mention of exactly how many people you’re talking about, (3) provide no concrete details that would allow anyone to see any other side to the story, and (4) not-so-subtly imply that all hell will break loose if this problem isn’t addressed some time real soon.

Note that what makes Fiske’s description unproductive and incendiary here is not that we have any reason to doubt the existence of the (anonymous) cases she alludes to. I have no doubt that Fiske does in fact hear regularly from students who have decided to leave academia because they feel unfairly targeted. But the thing is, it’s also an indisputable fact that many (in absolute terms) students leave academia because they have trouble getting along with their advisors, because they’re fed up with the low methodological standards in the field, or because they don’t like the long, unstructured hours that science requires.

The problem is not that Fiske is being untruthful; it’s that she’s short-circuiting the typical process of data- and reason-based argument by throwing lots of colorful anecdotes and emotional appeals at us. No indication is provided in her piece—or in my creative adaptations—as to whether the scenarios described are at all typical. How often, we should be asking ourselves, does it actually happen that people opt out of academia, or avoid seeking tenure, because of legitimate concerns about being unfairly criticized by their colleagues? How often do people leave the field because our standards are so terrible? Just how many psychology faculty are really such terrible advisors that their students regularly quit? If the answer to all of these questions is “extremely rarely”–or if there is reason to believe that in many cases, the story is not nearly as simple as the way Fiske is making it sound–then we don’t have a systematic problem that deserves our collective attention; at worst, we have isolated cases of people behaving badly. Unfortunately, the latter is a malady that universally afflicts every large group or organization, and as far as I know, there is no known cure.

From where I’m sitting, there is no evidence of an epidemic of interpersonal cruelty in psychology. There has undeniably been a rapid increase in open, critical commentary online; but as Chris Chambers, Andrew Gelman, and others have noted, this is much better understood as a welcome democratization of scientific discourse that levels the playing field and devalues the role of (hard-earned) status than as some kind of verbal war to the pain between rival psychological ideologies.

2. Three reasons why complaining about tone is a waste of time

Suppose you disagree with my argument above (which is totally cool—please let me know why in the comments below!) and insist that there clearly is a problem with the tone of our discourse. What then? Well, in that case, I would still respectfully suggest that if your plan for dealing with this problem is to complain about it in general terms, the way Fiske does—meaning, without ever pointing to specific examples or explaining exactly what you mean by “critiques of such personal ferocity” or “ad hominem smear tactics”—then you’re probably just wasting your time. Actually, it’s worse than that: not only are you wasting your own time, but you’re probably also going to end up pouring more fuel on the very fire you claim to be trying to put out (and indeed, this is exactly what Fiske’s op-ed seems to have accomplished).

I think there are at least three good reasons to believe that spending one’s time arguing over tone in abstract terms is a generally bad idea. Since I appear to have nothing but time, and you appear to still be reading this, I’ll discuss each of them in great gory detail.

The engine-on-fire view of science

First, unlike in many other domains of life, in science, the validity or truth value of a particular viewpoint is independent of the tone with which that viewpoint is being expressed. We can perhaps distinguish between two ways of thinking about what it means to do science. One approach is what we might call the negotiation model of science. On this model, when two people disagree over some substantive scientific issue, what they’re doing is trying to find a compromise position that’s palatable to both parties. If you say your finding is robust, and I say it’s totally p-hacked, then our goal is to iterate until we end up in a position that we both find acceptable. This doesn’t necessarily mean that the position we end up with must be an intermediate position (e.g., “okay, you only p-hacked a tiny bit”); it’s possible that I’ll end up entirely withdrawing my criticism, or that you’ll admit to grave error and retract your study. The point is just that the goal is, at least implicitly, to arrive at some consensual agreement between parties regarding our original disagreement.

If one views science through this kind of negotiation lens, concerns about tone make perfect sense. After all, in almost any other context when you find yourself negotiating with someone, it’s a generally bad idea to start calling them names or insulting their mother. If you’re hawking your goods at a market, it’s probably safe to assume that every prospective buyer has other options–they can buy whatever it is they need from some other place, and they don’t have to negotiate specifically with you if they don’t like the way you talk to them. So you watch what you say. And if everyone manages to get along without hurling insults, it’s possible you might even successfully close a deal, and go home one rug lighter and a few Euros richer.

Unfortunately, the negotiation model isn’t a good way to think about science, because in science, the validity of one’s views does not change in any way depending on whether one is dispositionally friendly, or perpetually acts like a raging asshole. A better way to think about science is in terms of what we might call, with great nuance and sophistication, the “engine-on-fire” model. This model can be understood as follows. Suppose you get hungry while driving a long distance, and pull into a convenience store to buy some snacks. Just as you’re opening the door to the store, some guy yells out behind you, “hey, asshole, your engine’s on fire!” He then continues to stand around and berate you while you call for emergency services and frantically run around looking for a fire extinguisher–all without ever lifting a finger to help you.

Two points about this story should be obvious. First, the guy who alerted you to your burning engine is very likely a raging asshole. And second, the fact that he’s a raging asshole doesn’t absolve you in any way from taking steps to put out your flaming engine. It may absolve you from saying thank you to him after the fact, but his unpleasant demeanor unfortunately doesn’t mean you can just choose to look the other way out of spite, and calmly head inside to buy your teriyaki beef jerky as the flames outside engulf your vehicle.

For better or worse, scientific disagreements are more like the engine-on-fire scenario than the negotiation scenario. Superficially, it may seem that two people with a scientific disagreement are in a process of negotiation. But a crucial difference is that if one person inexplicably decides to start yelling at the other–even as they continue to toss out methodological or theoretical criticisms (“only a buffoon of a scientist could fail to model stimulus as a random factor in this design!”)–their criticisms don’t become any less true by virtue of their tone. This doesn’t mean that tone is irrelevant and should be ignored, of course; if a critic calls you names while criticizing your work, it’s perfectly reasonable for you to object to the tone they’re using, and ask that they avoid personal attacks. Unfortunately, you can’t compel them to be nice to you, and the fact remains that if your critic decides to keep yelling at you, you still have a professional obligation to address the substance of their arguments, no matter how repellent you find their tone. If you don’t respond at all–either by explaining why the concern is invalid, or by adjusting your methodological procedures in some way–then there are now two scientific assholes in the world.

Distinguishing a bad case of the jerks from a bad case of the feels isn’t always easy

Much of the discussion over tone thus far has taken, as its starting point, people’s hurt feelings. Feelings deserve to be taken seriously; scientists are human beings, and the fact that the merit of a scientific argument is independent of the tone used to convey it doesn’t mean we should run roughshod over people’s emotions. The important point to note, though, is that the opposite point also holds: the fact that someone might be upset by someone else’s conduct doesn’t automatically mean that the other party is under any obligation–or even expectation–to change their behavior. Sometimes people are upset for understandable reasons that nevertheless do not imply that anyone else did anything wrong.

Daniel Lakens recently pointed this problem out in a nice blog post. The fundamental point is that it’s often impossible for scientists to cleanly separate substantive intellectual issues from personal reputation and ego, because it’s simply a fact that one’s intellectual output is, to varying extents, a reflection of one’s abilities as a scientist. Meaning, if I consistently put out work that’s heavily criticized by other researchers, there is a point at which that criticism does in fact begin to impugn my general ability as a scientist–even if the criticism is completely legitimate, impersonal, and never strays from substantive discussion of the intellectual issues.

Examples of this aren’t hard to find in psychology. To take just one widely-cited example: among the best-replicated findings in behavioral genetics (and indeed, all of psychology) is the finding that most traits show high heritability (typically on the order of 50%) and little influence of shared environment (typically close to 0%). In other words, an enormous amount of evidence suggests that parents have minimal influence on how their children will eventually turn out, independently of the genes they pass on. Given such knowledge, the scientifically honest thing to do, it would seem, is to assume that most child-parent behavioral correlations are largely driven by heritable factors rather than by parenting. Nevertheless, a large fraction of the developmental literature consists of researchers conducting purely correlational studies and drawing strong conclusions about the causal influence of parenting on children’s behavior on the basis of observed child-parent correlations.
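To make the confound concrete, here’s a minimal simulation sketch, written in Python with entirely invented parameter values (it isn’t a model of any real dataset), of a toy world in which parenting has exactly zero causal effect on children’s outcomes. Because parents transmit genes along with their parenting, a purely correlational study conducted in this world will still recover a respectable-looking “parenting effect”:

```python
# A toy sketch, not a model of any real dataset: parent-child correlations can
# arise purely from genetic transmission, even when parenting has zero causal
# effect on the child. Every parameter value below is invented.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Parent's trait: half additive-genetic, half unique environment (h2 = .5)
parent_genes = rng.normal(size=n)
parent_trait = np.sqrt(.5) * parent_genes + np.sqrt(.5) * rng.normal(size=n)

# Observed "parenting quality" is just a noisy expression of the parent's trait
parenting = .7 * parent_trait + rng.normal(scale=np.sqrt(1 - .7 ** 2), size=n)

# The child inherits (on average) half of the parent's additive genetic value
child_genes = .5 * parent_genes + np.sqrt(.75) * rng.normal(size=n)

# The child's outcome depends ONLY on the child's own genes and unique
# environment; parenting has no causal effect whatsoever in this toy world
child_outcome = np.sqrt(.5) * child_genes + np.sqrt(.5) * rng.normal(size=n)

r = np.corrcoef(parenting, child_outcome)[0, 1]
print(f"parenting-outcome correlation with zero true parenting effect: r = {r:.2f}")
```

With these made-up numbers, the expected parenting-outcome correlation works out to .7 × .5 × .5 ≈ .18, comfortably in the range of effects routinely interpreted as evidence of parental influence, even though parenting does no causal work at all here.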

If you think I’m exaggerating, consider the latest issue of Psychological Science, where we find a report of a purely longitudinal study (no randomized experiment, and no behavioral genetic component) that claims to find evidence of “a positive link between more nurturing family environments in childhood and greater security of attachment to spouses more than 60 years later.” The findings, we’re told in the abstract, “…underscore the far-reaching influence of childhood environment on well-being in adulthood.” The fact that 50 years of behavioral genetics studies have conclusively demonstrated that all, or nearly all, of this purported parenting influence is actually accounted for by genetic factors does not seem to deter the authors. The terms “heritable” or “genetic” do not show up anywhere in the article, and no consideration at all is given to the possibility that the putative effect of warm parental environment is at least partly (and quite possibly wholly) spurious. And there are literally thousands of other papers just like this one in the developmental literature–many of them continually published in some of our most prestigious journals.

Now, an important question arises: how is a behavioral geneticist supposed to professionally interact with a developmental scientist who appears to willfully ignore the demonstrably small influence of parenting, even after it is repeatedly pointed out to him? Is the geneticist supposed to simply smile and nod at the developmentalist and say, “that’s nice, you’re probably right about how important attachment styles are, because after all, you’re a nice person to talk to, and I want to keep inviting you to my dinner parties”? Or should she instead point out—repeatedly, if need be—the critical flaw in purely correlational designs that precludes any serious causal conclusions about parenting? And if she does the latter—always in a perfectly civil tone, mind you—how can that sentiment possibly be expressed in a way that both (a) is taken seriously enough by the target of criticism to effect a meaningful change in behavior, and (b) doesn’t seriously injure the target’s feelings?

This example highlights two important points, I think. First, when we’re being criticized, it can be very difficult to determine whether our critics are being unreasonable jerks, or are instead quite calmly saying things that we just don’t want to hear. As such, it’s a good idea to give our critics the benefit of the doubt, and assume they have fundamentally good intentions, even if our gut response is to retaliate as if they’re trying to cast our firstborn child into a giant lake of fire.

Second, unfortunate as it may be, being a nice person and being a good scientist are often in fundamental tension with one another–and virtually all scientists are frequently forced to choose which of the two they want to prioritize. I’m not saying you can’t be both a nice person and a good scientist on average. Of course you can. I’m just saying that there are a huge number of individual situations in which you can’t be both at the same time. If you ever find yourself at a talk given by one of the authors of the Psychological Science paper I mention above, you will have a choice between (a) saying nothing to the speaker during the question period (a “nice” action that hurts nobody’s feelings, but impedes scientific progress), and (b) pointing out that the chief conclusion expressed during the talk simply does not follow from any of the evidence presented (a “mean” action that will probably hurt the speaker’s feelings, but also serves to bring a critical scientific flaw to the attention of other scientists in the audience).

Now, one could potentially mount a reasonable argument in favor of being either a nice person, or a good scientist. I’m not going to argue that the appropriate thing to do is always to put science ahead of people’s feelings. Sometimes there can be good reasons to privilege the latter. But I don’t think we should pretend that the tension between good science and good personal relationships doesn’t exist. My own view, for what it’s worth, is that people who want to do science for a living should accept that they are going to be regularly and frequently criticized, and that hurt feelings and wounded egos are part and parcel of being cognitively limited agents with deep emotions who spend their time trying to understand something incredibly difficult. This doesn’t mean that it’s okay to yell at people or call them idiots in public–it isn’t, and we should work hard collectively to prevent such behavior. But it does mean that at some point in one’s scientific career–and probably at many, many points–one may have the distinctly unpleasant experience of another scientist saying “I think the kind of work you do is fundamentally not capable of answering the questions you’re asking,” or, “there’s a critical flaw in your entire research program.” In such cases, it’s understandable if one’s feelings are hurt. But hurt feelings don’t in any way excuse one from engaging seriously with the content of the criticism. Listening to people tell us we’re wrong is part of the mantle we assume when we decide to become scientists; if we only want to talk to other people when they agree with us, there are plenty of other good ways we can spend our lives.

Who’s actually listening?

The last reason that complaining about the general tone of discourse seems inadvisable is that it’s not clear who’s actually listening. I mean, obviously plenty of people are watching the current controversy unfold in the “hold on, let me get some popcorn” sense. But the real question is, who do we think is going to read Fiske’s commentary, or any other commentary like it, and think, “you know what–I see now that I’ve been a total jerk until now, and I’m going to stop”? I suspect that if we were to catalogue all the cases that Fiske thinks of as instances of “ad hominem smear tactics” or “public shaming and blaming”, and then ask the perpetrators for their side of the story, we would probably get a very different take on things. I imagine that in the vast majority of cases, what people like Fiske see as behavior that’s completely beyond the pale would be seen by the alleged perpetrators as harsh but perfectly reasonable criticism–and apologies or promises to behave better in future would probably not flow very freely.

Note that I’m emphatically not suggesting that the actions in question are always defensible. I’m not passing any judgment on anyone’s behavior at all. I have no trouble believing that in some of the cases Fiske alludes to, there are probably legitimate and serious causes for concern. But the problem is, I see no reason to think that in cases where someone really is being an asshole, they’re likely to stop being an asshole just because Fiske wrote an op-ed complaining about tone in general terms. For example, I personally don’t think Andrew Gelman’s criticism of Cuddy, Norton, or Fiske has been at all inappropriate; but supposing you do think it’s inappropriate, do you really think Gelman is going to stop vigorously criticizing research he disagrees with just because Fiske wrote a column calling for civility?

We therefore find ourselves in a rather unfortunate situation: Fiske’s appeal is likely to elicit both heartfelt nods of approval from anyone who feels they’ve ever been personally attacked by a “methodological terrorist”, and shrieks of indignation and moral outrage from anyone who feels Fiske is mistaking their legitimate criticism for personal abuse. What it’s not likely to elicit much of is serious self-reflection or change in behavior—if for no other reason than that it doesn’t describe any behavior in sufficient detail that anyone could actually think, “oh, yes, I see how that could be perceived as a personal attack.” In trying to avoid “damaging our field” by naming names, Fiske has, ironically, ended up writing a deeply divisive piece that appears to have only fanned the flames. I don’t think this is an accident; it seems to me like the inevitable fate of any general call for civility of this kind that fails to actually define or give examples of the behavior that is supposed to be so offensive.

The moral of the story is, if you’re going to complain about “critiques of such personal ferocity and relentless frequency that they resemble a denial-of-service attack” (and you absolutely should, if you think you have a legitimate case!), then you need to point to concrete behaviors that people can consider, evaluate, and learn from, and not just throw out vague allusions to “public shaming and blaming”, “ignoring ethical rules of conduct”, and “attacking the person and not the work”.

3. Fear of criticism is important—and healthy

Accusations of actual bullying are not the only concern raised by Fiske and other traditionalists. One of the other recurring themes that have come up in various commentaries on the tone of our current discourse is a fear of future criticism–and in particular, of being unfairly “targeted” for attack. In her column, Fiske writes that targets “often seem to be chosen for scientifically irrelevant reasons: their contrary opinions, professional prominence, or career-stage vulnerability.” On its face, this concern seems reasonable: surely it would be a bit unseemly for researchers to go running around gunning for one another purely to satisfy their petty personal vendettas. Science is supposed to be about the pursuit of truth, not vengeance!

Unfortunately, there is, so far as I can see, no possible way to enforce an injunction against pettiness or malicious intent. Nor should we want to try, because that would require a rather active form of thought policing. After all, who gets to decide what was in my head when I set out to replicate someone else’s study? Do we really want editors or reviewers passing judgment on whether an author’s motives for conducting a study were pure–and using that as a basis to discount the actual findings reported by the study? Does that really seem to Fiske like a good way to improve the tone of scientific discourse?

For better or worse, researchers do not–and cannot–have any right not to fear being “targeted” by other scientists–no matter what the motives in question may be. To the contrary, I would argue that a healthy fear of others’ (possibly motivated) negative evaluations is a largely beneficial influence on the quality of our science. Personally, I feel a not-insubstantial amount of fear almost any time I contemplate the way something I’ve written will be received by others (including these very words–as I’m writing them!). I frequently ask myself what I myself would say if I were reading a particular sentence or paragraph in someone else’s paper. And if the answer is “I would criticize it, for the following reasons…”, then I change or remove the offending statement(s) until I have no further criticisms. I have no doubt that it would do great things for my productivity if I allowed myself to publish papers as if they were only going to be read by friendly, well-intentioned colleagues. But then the quality of my papers would also decrease considerably. So instead, I try to write papers as if I expect them to be read by a death panel with a 90% kill quota. It admittedly makes writing less fun, but I also think it makes the end product much better. (The same principle also applies when seeking critical feedback on one’s work from others: if you only ever ask friendly, pleasant collaborators for their opinion on your papers, you shouldn’t be surprised if anonymous reviewers who have no reason to pull their punches later take a somewhat dimmer view.)

4. Fairness is in the eye of the beholder

Another common target of appeal in arguments about tone is fairness. We find fairness appeals implicitly in Fiske’s op-ed (presumably it’s a bad thing if some people switch careers because of fear of being bullied), and explicitly in a number of other commentaries. The most common appeal is to the negative career consequences of being (allegedly) unfairly criticized or bullied. The criticism doesn’t just impact on one’s scientific findings (goes the argument); it also makes it less likely that one will secure a tenure-track position, promotion, raise, or speaking invitations. Simone Schnall went so far as to suggest that the public criticism surrounding a well-publicized failure to replicate one of her studies made her feel like “a criminal suspect who has no right to a defense and there is no way to win.”

Now, I’m not going to try to pretend that Fiske, Schnall, and others are wrong about the general conclusion they draw. I see no reason to deny Schnall’s premise that her career has suffered as a result of the replication failure (though I would also argue that the bulk of that damage is likely attributable to the way she chose to respond to that replication failure, rather than to the actual finding itself). But the critical point here is, the fact that Schnall and others have suffered as a result of others’ replication failures and methodological criticisms is not in and of itself any kind of argument against those replication efforts and criticisms. No researcher has a right to lead a successful career untroubled and unencumbered by any serious questioning of their findings. Nor do early-career researchers like Alec Beall, whose paper suggesting that fertile women are more likely to wear red shirts was severely criticized by Andrew Gelman and others. It is lamentably true that incisive public criticism may injure the reputation and job prospects of those whose work has been criticized. And it’s also true that this can be quite unfair, in the sense that there is generally no particular reason why these particular people should be criticized and suffer for it, while other people with very similar bodies of work go unscathed, and secure plum jobs or promotions.

But here’s the thing: what doesn’t seem fair at the level of one individual is often perfectly fair–or at least, unavoidable–at the level of an entire community. As soon as one zooms out from any one individual, and instead surveys the field of psychology as a whole, it becomes clear that the job and reputation markets are, to a first approximation, a zero-sum game. As Gelman and many other people have noted, for every person who doesn’t get a job because their paper was criticized by a “replicator”, there could be three other candidates who didn’t get jobs because their much more methodologically rigorous work took too long to publish and/or couldn’t stack up in flashiness to the PR-grabbing work that did win the job lottery. At an individual level, neither of these outcomes is “fair”. But then, very little in the world of professional success–in any field–is fair; almost every major professional outcome, good or bad, is influenced by an enormous amount of luck, and I would argue that it is delusional to pretend otherwise.

At root, I think the question we should ask ourselves, when something good or bad happens, is not: is it fair that I got treated [better|worse] than the way that other person over there was treated? Instead, it should be: does the distribution of individual outcomes we’re seeing align well with what maximizes the benefit to our community as a whole? Personally, I find it very difficult to see trenchant public criticism of work that one perceives as sub-par as a bad thing–even as I recognize that it may seem deeply unfair to the people whose work is the target of that criticism. The reason for this is that an obvious consequence of an increasing norm towards open, public criticism of people’s work is that the quality of our work will, collectively, improve. There should be no doubt that this shift will entail a redistribution of resources: the winners and losers under the new norm will be different from the winners and losers under the old norm. But that observation provides no basis for clinging to the old norm. Researchers who don’t like where things are currently headed cannot simply throw out complaints about being “unfairly targeted” by critics; instead, they need to articulate principled arguments for why a norm of open, public scientific criticism would be bad for science as a whole–and not just bad for them personally.

5. Everyone but me is biased!

The same logic that applies to complaints about being unfairly targeted also applies, I think, to complaints about critics’ nefarious motives or unconscious biases. To her credit, Fiske largely avoids imputing negative intent to her perceived adversaries–even as she calls them all kinds of fun names. Other commentators, however, have been less restrained–for example, suggesting that “there’s a lot of stuff going on where there’s now people making their careers out of trying to take down other people’s careers”, or that replicators “seem bent on disproving other researchers’ results by failing to replicate”. I find these kinds of statements uncompelling and, frankly, unseemly. The reason they’re unseemly is not that they’re wrong. Actually, they’re probably right. I don’t doubt that, despite what many reformers say, some of them are, at least some of the time, indeed motivated by personal grudges, a desire to bring down colleagues of whom they’re envious, and so on and so forth.

But the thing is, those motives are completely irrelevant to the evaluation of the studies and critiques that these people produce. The very obvious reason why the presence of bias on the part of a critic cannot be grounds to discount a study is that critics are not the only people with biases. Indeed, applying such a standard uniformly would mean that nobody’s findings should ever be taken seriously. Let’s consider just a few of the incentives that could lead a researcher who is conducting novel research and dreams of publishing their findings in the hallowed pages of, say, Psychological Science, to cut a few corners and end up producing some less-than-reliable findings:

  • Increased productivity: It’s less work to collect small convenience samples than large, representative ones.
  • More compelling results: Statistically significant results generated in small samples are typically more impressive-looking than those obtained from very large samples, due to sampling error and selection bias (a pattern illustrated in the sketch just after this list).
  • Simple stories: The more one probes a particular finding, the greater the likelihood that one will identify some problem that questions the validity of the results, or adds nuance and complexity to an otherwise simple story. And “mixed” findings are harder to publish.
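To put a rough number on the second of these bullets, here’s a quick simulation sketch (Python; the true effect size, sample sizes, and number of simulated studies are all invented for illustration) of the familiar selection effect: when only statistically significant results get written up, small-sample estimates end up badly inflated relative to the truth, while large-sample estimates do not:

```python
# A rough illustration of the "more compelling results" point above, using
# entirely invented numbers: when only significant results get reported,
# small samples yield badly inflated effect-size estimates.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
true_d = 0.2                 # assumed true standardized effect size
n_sims = 20_000              # number of simulated studies per condition

for n_per_group in (20, 200):
    published = []
    for _ in range(n_sims):
        treat = rng.normal(true_d, 1.0, n_per_group)   # "treatment" group
        ctrl = rng.normal(0.0, 1.0, n_per_group)       # "control" group
        res = stats.ttest_ind(treat, ctrl)
        if res.pvalue < .05 and res.statistic > 0:     # only "significant" findings survive
            published.append(treat.mean() - ctrl.mean())
    print(f"n = {n_per_group:3d} per group: "
          f"mean published effect ~ {np.mean(published):.2f} (true effect = {true_d})")
```

Under these assumptions, the “published” estimate in the small-sample condition should come out at several times the true effect, whereas the large-sample condition inflates it only modestly.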

All of these benefits, of course, feed directly into better prospects for fame, fortune, jobs, and promotions. So the idea that a finding published in one of our journals should be considered bias-free because it happened to come first, while a subsequent criticism or replication of that finding should be discounted because of personal motives or other biases is, frankly, delusional. Biases are everywhere; everyone has them. While this doesn’t mean that we should ignore them, it does mean that we should either (a) call all biases out equally–which is generally impossible, or at the very least extremely impractical–or (b) accept that doing so is not productive, and that the best way to eliminate bias over the long term is to pit everyone’s biases against each other and let logical argument and empirical data decide who’s right. Put differently, if you’re going to complain that Jane Doe is clearly motivated to destroy your cherished finding in order to make a name for herself, you should probably preface such an accusation with the admission that you obviously had plenty of motivation to cut corners when you produced the finding in the first place, since you knew it would help you make a name for yourself. Asymmetric appeals that require one to believe that bias exists in only one group of people simply don’t deserve to be taken seriously.

Personally, I would suggest that we adopt a standard policy of simply not talking about other people’s motivations or biases. If you can find evidence of someone’s bias in the methods they used or the analyses they conducted, then great–you can go ahead and point out the perceived flaws. That’s just being a good scientist. But if you can’t, then what was in your (perceived) adversary’s head when she produced her findings is quite irrelevant to scientific discourse–unless you think it would be okay for your critics to discount your work on the grounds that you clearly had all kinds of incentives to cheat.

Conclusions

Uh, no. No conclusions this time–this post is already long enough as is. And anyway, I already posted all of my conclusions way back at the beginning. So you can scroll all the way up there if you want to read them again. Instead, I’m going to try to improve your mood a tiny bit (if not the tone of the debate) by leaving you with a happy little painting automagically generated by Bot Ross.


* I lied! There were no literary allusions.

22 thoughts on “There is no “tone” problem in psychology”

  1. Great post Tal. I think it is really helpful to understand that nice people can still disagree with elements of scientific discovery. I think it can be difficult to do this, but it ultimately makes for happier and more productive scientists all round.

  2. I generally agree but it seems that you imagine there is a bright line between ad hominem and the content of the criticism, the former being what people ought to ignore and the latter being what people ought to pay attention to. What about cases where the criticism implies that researchers have engaged or are engaging in some form of misconduct? At this point any scientist with a pulse should be aware that p-hacking and HARK-ing aren’t kosher, and continuing to do so going forward warrants severe criticism; but a lot of the work that is failing to replicate was conducted when researchers were less aware of how problematic these practices were, and yet the researchers do face comments and critiques that imply they should’ve known better and are worthy of contempt for polluting the field with garbage results. They then get criticized for reacting badly when people use them as poster children for poor practices (even though we know that so many of us have engaged in those very same practices in the past!). It’s the holier-than-thou tone that some criticisms are accompanied by that I think some people find problematic and that probably increases the defensive posture of the researchers whose work is being targeted.

    That said, I agree that whining about it is pointless, although it’s quite possible that if we want people to stop defending and trying to prop up years of bad work then we should care more about how and what we communicate. I disagree that much of the criticism can be clearly broken down into ad hominem and substantive constituent parts.

    1. Thanks for the comment, nonpartisanbystander. I don’t think I suggested or implied anywhere that I think there’s a hard line between appropriate and inappropriate comments (and I actually didn’t even use the phrase “ad hominem” except as a direct quote from Fiske). If you’re referring to the section about still having to respond to the criticism, even when the tone sucks, my point there wasn’t that people should take issue with ad hominem attacks but allow everything else to go uncommented on; it’s that one should feel free to comment on anything at all that strikes one as problematic (tone included), but that independently of that, one still has to respond to the substance of the criticism. So, for example, if someone reads your paper and says something like “frankly, I don’t trust these findings, because the authors didn’t preregister them; they seem completely p-hacked, and it’s ridiculous to think a significant 3-way interaction from a sample of 20 people can be taken seriously”, you’re perfectly welcome to explicitly take issue with that person’s tone if you like. But you can’t dismiss their criticism out of hand simply because you think they’re being a jerk. If you think they’re accusing you of misconduct, you’re well within your rights to bring that up as part of the discussion, but there’s still a valid scientific criticism being posed that needs to be addressed as well (I mean, it’s true that 3-way interactions from tiny samples are generally very dubious).

      As an aside, I think we should be very careful not to suggest that p-hacking is something that used to be widespread, and has now all but disappeared (I’ve seen a number of other such suggestions to this effect recently). My general impression is that most findings being produced right now continue to be overfitted to a fairly significant extent. Frankly, it’s hard to see how it could be otherwise, because in the absence of preregistration, it’s actually all but impossible to avoid overfitting. It’s incredibly difficult to avoid fooling one’s self when it comes to data analysis. As Andrew Gelman nicely shows in his “garden of forking paths” paper, you can overfit your data severely even if you only ever do a single analysis (and of course, most people don’t just do a single analysis). So I think this is another place where people clearly understand p-hacking to mean different things, and it’s a good idea to interpret other people’s comments charitably, if possible. Strictly speaking, any flexibility in analysis that distorts a p-value is p-hacking. This could be as benign as discarding a clear outlier from the sample (say, z-score deviation of 4) after looking at the data (as opposed to stipulating in advance exactly what the outlier removal procedure would be). In my experience, most people don’t think of this as p-hacking, so they get upset when someone suggests they p-hacked. When actually, virtually everyone p-hacks all the time, and the important question is to what degree one p-hacked. (And this is also why public preregistration is so important–it’s arguably the only way to ensure that one is completely unable to fool one’s self.)
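      To make that a bit more tangible, here’s a toy simulation (Python; the sample size, trimming threshold, and data-generating distribution are all invented) of just one fork in the garden: the data are pure noise, but the analyst gets to report whichever of two defensible analyses, with or without post hoc outlier trimming, happens to reach significance. Even that single innocuous-looking choice typically pushes the false-positive rate above the nominal 5%:

      ```python
      # Toy sketch of one "fork in the garden": the data are pure noise, but the
      # analyst reports whichever of two defensible analyses (with or without
      # outlier trimming) happens to be significant. All thresholds are invented.
      import numpy as np
      from scipy import stats

      rng = np.random.default_rng(2)
      n_sims, n = 20_000, 30
      false_positives = 0

      for _ in range(n_sims):
          x = rng.standard_t(df=5, size=n)      # null data: no true effect at all
          p_full = stats.ttest_1samp(x, 0).pvalue
          z = (x - x.mean()) / x.std(ddof=1)
          trimmed = x[np.abs(z) < 2.5]          # "outlier" removal decided after seeing the data
          p_trim = stats.ttest_1samp(trimmed, 0).pvalue
          if min(p_full, p_trim) < .05:         # report whichever analysis "worked"
              false_positives += 1

      print(f"nominal alpha = .05, actual false-positive rate ~ {false_positives / n_sims:.3f}")
      ```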

      1. Oh, and regarding the suggestion that “if we want people to stop defending and trying to prop up years of bad work then we should care more about how and what we communicate”: it’s a valid point, but I have very mixed feelings about this, having seen it go both ways over the years. Meaning, for roughly every case where I can think of someone’s criticism being brushed off inappropriately purely due to tone concerns, I can think of another case where the target of the criticism didn’t bother to even acknowledge the criticism until names were being named. Ultimately it’s an open empirical question as to what really works best to promote constructive dialog, but I’m not at all confident that the answer is that we should try to be nice to each other because that’s more productive.

        For example, after reading my post, today James Coyne tweeted that it was “…interesting that my intemperate tweet he cites marked turning point in tolerance of abuse of mecfs patients. I’m proud.” It’s not at all obvious to me that Coyne is wrong about this. His tone in going after the authors of the study in question was extremely aggressive, and sometimes clearly crossed the line into ad hominem territory; yet he may very well be right that if not for his aggressive pursuit of what he saw as the truth, there would never have been any follow-up, because it is much easier for researchers to brush aside or ignore criticism when it’s framed in a pleasant way. Anyway, it’s a complicated issue.

        1. Thanks for your reply. Yes, I get that you’re saying that people should feel free to comment on anything but should not evade the substantive concern on account of disliking the tone in which it was expressed. My point was that whatever problem of tone there is, it is often more complex than name calling. But fair enough, it doesn’t sound like you intended to imply otherwise, and I agree that arguing that tone is a major problem is misguided and (imho) an attempt at deflection from the real problems the field is grappling with.

          However, as I said, it may still be something worth thinking about further if many people are getting offended and reacting defensively. You say “I’m not at all confident that the answer is that we should try to be nice to each other because that’s more productive.” Agreed. I just threw it out there that we ought to be more reflective about this (we’re psychologists after all!), that is, if we care about changing the norms and getting more buy in. While I’m skeptical that calling individuals out aggressively is the right approach, I don’t think being nice is the answer either. That said, it may be that we don’t need to worry too much about this and that regardless of what we do the norms will continue to change and those who remain defensive will be left behind.

          For example — and maybe I’m too optimistic here — I found it noteworthy that although Amy Cuddy is still staunchly defending power pose effects, she did feel it important to state that she’s on board with the increasingly rigorous standards of the field. This has happened in other corners as well. Of course, one could say that her sentiments are not genuine, but I think it’s a sign that things are going in the right direction if she feels the need to signal agreement.

          Really good point re: p-hacking. I think it’s my bias to assume that most researchers are aware of what exactly is covered by this term, but I think that has much to do with who I am surrounded with. Not to say that I assume that most have stopped p-hacking or are trying to. I totally agree with you that without preregistration this is unlikely.

  3. Hi Tal,

    Great thoughts. And I basically agree with everything you said, in a vacuum. I agree that the tone problem does not (to me) seem to be as big of a thing as Fiske seems to have made it out to be. I agree that talking about tone offenses in general terms is bad. I agree that criticism is a crucial part of science, and it’s difficult to disentangle criticism of an idea from criticism of one’s worth as a scientist.

    However, it’s that last part that makes me pause a bit with respect to dismissing the ‘tone problem’ entirely.

    You write:

    > does the distribution of individual outcomes we’re seeing align well with what maximizes the benefit to our community as a whole?

    I like this idea. A healthy criticism of science is no doubt good for advancing the field. But I say that while sitting in a place of privilege. I’m a white man working at a top research university in the United States. Having people criticize my work hurts, yeah. But I think my experience pales in comparison to a young person of color who does not hold the privilege of being associated with a name-brand institution. And this is especially true if they’re not a man. The effects of the same criticism are going to be a bit different than if they were leveled at me. I think those individuals would be much more likely to react to this criticism by thinking that maybe this field is not for people like them – something that would be reinforced if they look around at the broader academic environment. The identities of the senior folks in the field, the most vocal contributors to discussions in talks, meetings, and in social media would suggest the same thing.

    So, I think we should be careful about dismissing this entirely out of hand. To be clear, I’m *not* suggesting that we shouldn’t criticize some science because of who is doing it. But I think it’s important to keep in mind that aside from improving the quality of the science, we could also be inadvertently reducing the quality of *scientists*.

    What to do about this? Well, using your engine-on-fire view (which I loved, btw), I think there’s a third option here in addition to a) putting out the fire and calling the guy out as an asshole or b) just going to get your beef jerky anyway. The third option involves the rest of us watching this exchange. If someone is a raging asshole, then they should be called out publicly by the community, and this should *not* be left to the person who is trying to put out their fire. And if the science needs to be criticized, those of us who are not assholes should do it in the most supportive way possible, especially if the criticism is rolling downhill from those of us sitting up on our hill of privilege.

    -Travis

  4. Travis, I completely agree that we want to work to maximize diversity. But I’m not sure I agree that a norm of open criticism is likely to reduce diversity. A priori, it seems just as likely to me (actually, probably more likely) that the shift towards openness will increase diversity of viewpoints, since the barrier to entry is much lower. The traditional incentive structure seems to me to have very much of a rich-get-richer flavor, where if you’re at an elite institution and have managed to build a solid CV over a decade or two, your path to continued success is much easier. The internet is, if anything, a destabilizing medium, in that it allows anyone–regardless of gender, ethnicity, age, etc.–to weigh in with any position on any issue (and typically without anyone even initially knowing what institution you’re from, or what your background is). Indeed, as bloggers like the Neuroskeptic clearly demonstrate, one can exert great influence on scientific discourse while remaining entirely pseudonymous for years upon years. So, while I agree with you regarding the values we ought to prioritize, I guess I don’t see any particular reason why a culture of greater openness should lead to reduced diversity or decreased quality of scientists; my expectation is that it’s more likely to do the opposite.

    1. We’re in agreement that a movement towards openness is almost certainly beneficial with regards to diversity of viewpoints. Indeed, I can’t imagine many systems much worse than the old system in which one’s views are unlikely to be heard unless you’re pals with the power-holders. However, that does not mean that a system built on total openness is a magic pill without disparate outcomes.

      I think a parallel can be drawn with varieties of socioeconomic systems. Is Laissez-faire better for society at large than feudalism? Certainly. But that doesn’t also mean that there aren’t some serious problems with the former that need to have attention vocally drawn to them.

      1. Agreed. And as I explicitly said in the post, I think we should collectively call out bad behavior when we see it. But we should focus on the specific cases in question rather than leveling broad and unsubstantiated accusations at some nameless subset of the population in the vague hope that the people we’re thinking of will spontaneously shape up. And unfortunately, that requires naming names and confronting people directly.

    2. I have to agree with Tal here. You cannot “inadvertently reduce the quality of scientists” unless their work was overvalued in the first place. To reinstate the quality and perceived integrity of scientists within the community in general, you first have to redefine the standards of valuable work. If Travis is suggesting tolerating false promotion for the sake of face saving, that would seem retrogressive.

  5. Hey Tal, typing this on my iPhone, so please excuse the brevity (and possible typos/auto-correct garbage…)

    I very much enjoyed your post, and think that despite the already many voices, each and every additional one counts and is important!

    The only point I found myself wondering about is to what extent science does not also necessarily contain an element of negotiation. For instance, one could argue (within the framework of NHST) which p-threshold should be (or would be) the most reliable one as a trade-off between the different goals (avoiding false-positives, false-negatives, file-drawer crap, etc.), and I am not sure how such a question would be approached “scientifically”, that is to say in a way in which it could be proven to produce the best outcome.

    Psychology research is to some extent a moving target–the very phenomena we want to study and explain may change over time, given people’s (culturally transmitted) knowledge, and so I am equally not sure how a study, particularly an older one, could ever be scientifically disproven using a newer sample. And while making more precise predictions for future observations is a natural goal of our branch of science, so is the interpretation of findings in terms of mental states, and those can equally be only argued about, not proven (or can they??)

    What do you think?

    1. Jochen, agreed, there’s definitely a (large) element of negotiation involved in doing science. I think the critical point is just to recognize that scientific assertions are objective statements about the nature of the world we live in, so in principle, there’s a fact of the matter about whether or not they’re correct. They’re not like, say, music preferences, where you can try to convince me for five minutes that death metal is awesome, but if I don’t agree with you, you can just write it off as a disagreement with no implications for either of our world views. If I have some substantive criticism of your scientific work, you can’t just ignore it because you don’t like it; no matter how much of an asshole you think I am, you still have a duty to consider what I’m saying (though of course you’re not under any obligation to agree with my criticism). So my point is not so much that bargaining with other people isn’t important, it’s that you can’t look at scientific interactions exclusively through a bargaining lens.

  6. “In other words, an enormous amount of evidence suggests that parents have minimal influence on how their children will eventually turn out, independently of the genes they pass on.”

    I appreciate many of the points you’ve made in this post, Tal. But I want to push you a bit more on this point to see if I fully understand your perspective.

    I have always taken the findings of twin research to suggest that one of the factors that helps explain individual differences in a wide range of phenotypes is genetic variation. But, with a few exceptions (e.g., David Rowe), I haven’t read behavior geneticists as implying that the presence of additive genetic sources of variance mutually excludes the role of parents in shaping children. Is it your position that parental influence only manifests as shared environmental contributions in ACE models? Or that non-shared environmental components cannot reflect, in part, parental influence?

    I’m still undecided on how all of this plays out in the realm of attachment, despite having studied attachment dynamics for many years. There is some work out there, including studies that my collaborator Glenn Roisman and I have published, suggesting that early attachment reflects shared and nonshared environmental components (e.g., Roisman & Fraley, 2008). And, although some of our work suggests that the additive genetic component can be small, that isn’t always the case. For example, in one bivariate analysis we found that the association between parental support and various outcomes could be decomposed into genetic, shared, and non-shared environmental components (Roisman & Fraley, 2012).

    Anyhow, my point isn’t really to summarize contemporary research on attachment, development, and behavior genetics. Even within the field of attachment research there are debates about the extent to which shared vs. non-shared environmental influences matter, whether the variance components vary across measurement instruments and age, and so on. But I did want to suggest the possibility that there may be areas of psychology where people entertain certain hypotheses (e.g., that the quality of the relationship that parents have with their children may matter for some outcomes) not because scholars are “willfully ignoring” research in behavior genetics, but because they are weighing all of the possible options rather than ruling some of those out via fiat. Just as I wouldn’t want attachment researchers to dismiss the possibility that genetic factors play a role in shaping relationship development, I wouldn’t want behavior geneticists to dismiss the possibility that parents and other relationship partners play a role too.

  7. Chris, the problem here is not that many developmental psychologists refuse to accept that parenting has no influence. Reasonable people can debate the precise degree to which parenting matters, and under which conditions. The point is that given decades of work clearly demonstrating that additive genetic factors explain substantially more variance in most phenotypes than shared environment, there is no basis whatsoever for simply assuming that a parent-child correlation implies a causal influence of parenting, with no consideration given to other factors.

    To see how absurd conclusions like the one in the paper I quote are, imagine if instead of interpreting parent-child correlations as clear evidence of parental influence, we simply interpreted them as causal evidence of additive genetic influences, without so much as a mention of the possibility that, hey, maybe parenting matters. Then we’d have quotes like “a positive link between parental genes that produce supportive environments and greater security of attachment to spouses more than 60 years later” (even though nobody actually measured or apportioned the genetic variance at any point). And we would be told that this kind of purely correlational evidence “…underscores the far-reaching influence of parents’ genes on their children’s well-being in adulthood.” I don’t think most developmentalists would let such conclusions go by without objection; I suspect there would probably be blood in the street (metaphorically speaking). Even though, ironically, the evidence for the latter inference is actually much better than the evidence for the former.

    In other words, my complaint is not that many developmental psychologists don’t agree with a naive reading of the behavioral genetics literature; it’s that, in literally thousands of articles, they don’t even pay it lip service. It strikes me as a scientific error of the highest order to infer one particular form of causation from an observed correlation when there is in fact very robust evidence suggesting that an alternative causal explanation is, a priori, much more likely to hold.

    1. Follow-up: after re-reading the section in question, I agree that it was stronger than it needed to be. I edited a few phrases (e.g., “negligible” –> “small”) to make it clearer that my point is not that we know definitively that parenting doesn’t matter at all, but that there’s simply no basis for assuming anything causal about the role of parenting from purely correlational results.
