No, it’s not The Incentives—it’s you

There’s a narrative I find kind of troubling, but that unfortunately seems to be growing more common in science. The core idea is that the mere existence of perverse incentives is a valid and sufficient reason to knowingly behave in an antisocial way, just as long as one first acknowledges the existence of those perverse incentives. The way this dynamic usually unfolds is that someone points out some fairly serious problem with the way many scientists behave—say, our collective propensity to p-hack as if it’s going out of style, or the fact that we insist on submitting our manuscripts to publishers that are actively trying to undermine our interests—and then someone else will say, “I know, right—but what are you going to do, those are the incentives.”

As best I can tell, the words “it’s the incentives” are magic. Once they’re uttered by someone, natural law demands that everyone else involved in the conversation immediately stop whatever else they were doing, solemnly nod, and mumble something to the effect that, yes, the incentives are very bad, very bad indeed, and it’s a real tragedy that so many smart, hard-working people are being crushed under the merciless, gigantic boot of The System. Then there’s usually a brief pause, and after that, everyone goes back to discussing whatever they were talking about a moment earlier.

Perhaps I’m getting senile in my early middle age, but my anecdotal perception is that it used to be that, when somebody pointed out to a researcher that they might be doing something questionable, that researcher would typically either (a) argue that they weren’t doing anything questionable (often incorrectly, because there used to be much less appreciation for some of the statistical issues involved), or (b) look uncomfortable for a little while, allow an awkward silence to bloom, and then change the subject. In the last few years, I’ve noticed that uncomfortable discussions about questionable practices disproportionately seem to end with a chuckle or shrug, followed by a comment to the effect that we are all extremely sophisticated human beings who recognize the complexity of the world we live in, and sure it would be great if we lived in a world where one didn’t have to occasionally engage in shenanigans, but that would be extremely naive, and after all, we are not naive, are we?

There is, of course, an element of truth to this kind of response. I’m not denying that perverse incentives exist; they obviously do. There’s no question that many aspects of modern scientific culture systematically incentivize antisocial behavior, and I don’t think we can or should pretend otherwise. What I do object to quite strongly is the narrative that scientists are somehow helpless in the face of all these awful incentives—that we can’t possibly be expected to take any course of action that has any potential, however small, to impede our own career development.

“I would publish in open access journals,” your friendly neighborhood scientist will say. “But those have a lower impact factor, and I’m up for tenure in three years.”

Or: “if I corrected for multiple comparisons in this situation, my effect would go away, and then the reviewers would reject the paper.”

Or: “I can’t ask my graduate students to collect an adequately-powered replication sample; they need to publish papers as quickly as they can so that they can get a job.”

There are innumerable examples of this kind, and they’ve become so routine that it appears many scientists have stopped thinking about what the words they’re saying actually mean, and instead simply glaze over and nod sagely whenever the dreaded Incentives are invoked.

A random bystander who happened to eavesdrop on a conversation between a group of scientists kvetching about The Incentives could be forgiven for thinking that maybe, just maybe, a bunch of very industrious people who generally pride themselves on their creativity, persistence, and intelligence could find some way to work around, or through, the problem. And I think they would be right. The fact that we collectively don’t see it as a colossal moral failing that we haven’t figured out a way to get our work done without having to routinely cut corners in the rush for fame and fortune is deeply troubling.

It’s also aggravating on an intellectual level, because the argument that we’re all being egregiously and continuously screwed over by The Incentives is just not that good. I think there are a lot of reasons why researchers should be very hesitant to invoke The Incentives as a justification for why any of us behave the way we do. I’ll give nine of them here, but I imagine there are probably others.

1. You can excuse anything by appealing to The Incentives

No, seriously—anything. Once you start crying that The System is Broken in order to excuse your actions (or inactions), you can absolve yourself of responsibility for all kinds of behaviors that, on paper, should raise red flags. Consider just a few behaviors that few scientists would condone:

  • Fabricating data or results
  • Regularly threatening to fire trainees in order to scare them into working harder
  • Deliberately sabotaging competitors’ papers or grants by reviewing them negatively

I think it’s safe to say most of us consider such practices to be thoroughly immoral, yet there are obviously people who engage in each of them. And when those people are caught or confronted, one of the most common justifications they fall back on is… you guessed it: The Incentives! When Diederik Stapel confessed to fabricating the data used in over 50 publications, he didn’t explain his actions by saying “oh, you know, I’m probably a bit of a psychopath”; instead, he placed much of the blame squarely on The Incentives:

I did not withstand the pressure to score, to publish, the pressure to get better in time. I wanted too much, too fast. In a system where there are few checks and balances, where people work alone, I took the wrong turn. I want to emphasize that the mistakes that I made were not born out of selfish ends.

Stapel wasn’t acting selfishly, you see… he was just subject to intense pressures. Or, you know, Incentives.

Or consider these quotes from a New York Times article describing Stapel’s unraveling:

In his early years of research — when he supposedly collected real experimental data — Stapel wrote papers laying out complicated and messy relationships between multiple variables. He soon realized that journal editors preferred simplicity. “They are actually telling you: ‘Leave out this stuff. Make it simpler,’” Stapel told me. Before long, he was striving to write elegant articles.

The experiment — and others like it — didn’t give Stapel the desired results, he said. He had the choice of abandoning the work or redoing the experiment. But he had already spent a lot of time on the research and was convinced his hypothesis was valid. “I said — you know what, I am going to create the data set,” he told me.

Reading through such accounts, it’s hard to avoid the conclusion that Stapel’s self-narrative is strikingly similar to the one that gets tossed out all the time on social media, or in conference bar conversations: here I am, a good scientist trying to do an honest job, and yet all around me is a system that incentivizes deception and corner-cutting. What do you expect me to do?

Curiously, I’ve never heard any of my peers—including many of the same people who are quick to invoke The Incentives to excuse their own imperfections—seriously endorse The Incentives as an acceptable justification for Stapel’s behavior. In Stapel’s case, the inference we overwhelmingly jump to is that there must be something deeply wrong with Stapel, seeing as the rest of us also face the same perverse incentives on a daily basis, yet we somehow manage to get by without fabricating data. But this conclusion should make us a bit uneasy, I think, because if it’s correct (and I think it is), it implies that we aren’t really such slaves to The Incentives after all. When our morals get in the way, we appear to be perfectly capable of resisting temptation. And I mean, it’s not even like it’s particularly difficult; I doubt many researchers actively have to fight the impulse to manipulate their data, despite the enormous incentives to do so. I submit that the reason many of us feel okay doing things like reporting exploratory results as confirmatory results, or failing to mention that we ran six other studies we didn’t report, is not really that The Incentives are forcing us to do things we don’t like, but that it’s easier to attribute our unsavory behaviors to unstoppable external forces than to take responsibility for them and accept the consequences.

Needless to say, I think this kind of attitude is fundamentally hypocritical. If we’re not comfortable with pariahs like Stapel blaming The Incentives for causing them to fabricate data, we shouldn’t use The Incentives as an excuse for doing things that are on the same spectrum, albeit less severe. If you think that what the words “I did not withstand the pressure to score” really mean when they fall out of Stapel’s mouth is something like “I’m basically a weak person who finds the thought of not being important so intolerable I’m willing to cheat to get ahead”, then you shouldn’t give yourself a free pass just because when you use that excuse, you’re talking about much smaller infractions. Consider the possibility that maybe, just like Stapel, you’re actually appealing to The Incentives as a crutch to avoid having to make your life very slightly more difficult.

2. It would break the world if everyone did it

When people start routinely accepting that The System is Broken and The Incentives Are Fucking Us Over, bad things tend to happen. It’s very hard to have a stable, smoothly functioning society once everyone believes (rightly or wrongly) that gaming the system is the only way to get by. Imagine if every time you went to your doctor—and I’m aware that this analogy won’t work well for people living outside the United States—she sent you to get a dozen expensive and completely unnecessary medical tests, and then, when prompted for an explanation, simply shrugged and said “I know I’m not an angel—but hey, them’s The Incentives.” You would be livid—even though it’s entirely true (at least in the United States; other developed countries seem to have figured this particular problem out) that many doctors have financial incentives to order unnecessary tests.

To be clear, I’m not saying perverse incentives never induce bad behavior in medicine or other fields. Of course they do. My point is that practitioners in other fields at least appear to have enough sense not to loudly trumpet The Incentives as a reasonable justification for their antisocial behavior—or to pat themselves on the back for being the kind of people who are clever enough to see the fiendish Incentives for exactly what they are. My sense is that when doctors, lawyers, journalists, etc. fall prey to The Incentives, they generally consider that to be a source of shame. I won’t go so far as to suggest that we scientists take pride in behaving badly—we obviously don’t—but we do seem to have collectively developed a rather powerful form of learned helplessness that doesn’t seem to be matched by other communities. Which is a fortunate thing, because if every other community also developed the same attitude, we would be in a world of trouble.

3. You are not special

Individual success in science is, to a first approximation, a zero-sum game—at least in the short term. Many scientists who appeal to The Incentives seem to genuinely believe that opting out of doing the right thing is a victimless crime. I mean, sure, it might make the system a bit less efficient overall… but that’s just life, right? It’s not like anybody’s actually suffering.

Well yeah, people actually do suffer. There are many scientists who are willing to do the right things—to preregister their analysis plans, to work hard to falsify rather than confirm their hypotheses, to diligently draw attention to potential confounds that complicate their preferred story, and so on. When you assert your right to opt out of these things because apparently your publications, your promotions, and your students are so much more important than everyone else’s, you’re cheating those people.

No, really, you are. If you don’t like to think of yourself as someone who cheats other people, don’t reflexively collapse on a crutch made out of stainless steel Incentives any time someone questions your process. You are not special. Your publications, job, and tenure are not more important than other people’s. The fact that there are other people in your position engaging in the same behaviors doesn’t mean you and your co-authors are all very sophisticated, and that the people who refuse to cut corners are naive simpletons. What it actually demonstrates is that, somewhere along the way, you developed the reflexive ability to rationalize away behavior that you would disapprove of in others and that, viewed dispassionately, is clearly damaging to science.

4. You (probably) have no data

It’s telling that appeals to The Incentives are rarely supported by any actual data. It’s simply taken for granted that engaging in the practice in question would be detrimental to one’s career. The next time you’re tempted to blame The System for making you do bad things, you might want to ask yourself this: Do you actually know that, say, publishing in PLOS ONE rather than [insert closed society journal of your choice] would hurt your career? If so, how do you know that? Do you have any good evidence for it, or have you simply accepted it as stylized fact?

Coming by the kind of data you’d need to answer this question is actually not that easy: it’s not enough to reflexively point to, say, the fact that some journals have higher impact factors than others. To identify the utility-maximizing course of action, you’d need to integrate over both benefits and costs, and the costs are not always so obvious. For example, the opportunity cost of passing up a “good” journal will be offset to some extent by the likelihood of faster publication (no need to spend two years racking up rejections at high-impact venues), by the positive signal you send to at least some of your peers that you support open scientific practices, and so on.

I’m not saying that a careful consideration of the pros and cons of doing the right thing would usually lead people to change their minds. It often won’t. What I’m saying is that people who blame The Incentives for forcing them to submit their papers to certain journals, to tell post-hoc stories about their work, or to use suboptimal analytical methods don’t generally support their decisions with data, or even with well-reasoned argument. The defense is usually completely reflexive—which should raise our suspicion that it’s also just a self-serving excuse.

5. It (probably) won’t matter anyway

This one might hurt a bit, but I think it’s important to consider—particularly for early-career researchers. Let’s suppose you’re right that doing the right thing in some particular case would hurt your career. Maybe it really is true that if you comprehensively report in your paper on all the studies you ran, and not just the ones that “worked”, your colleagues will receive your work less favorably. In such cases it may seem natural to think that there has to be a tight relationship between the current decision and the global outcome—i.e., that if you don’t drop the failed studies, you won’t get a tenure-track position three years down the road. After all, you’re focusing on that causal relationship right now, and it seems so clear in your head!

Unfortunately (or perhaps fortunately?), reality doesn’t operate that way. Outcomes in academia are multiply determined and enormously complex. You can tell yourself that getting more papers out faster will get you a job if it makes you feel better, but that doesn’t make it true. If you’re a graduate student on the job market these days, I have sad news for you: you’re probably not getting a tenure-track job no matter what you do. It doesn’t matter how many p-hacked papers you publish, or how thinly you slice your dissertation into different “studies”; there are not nearly enough jobs to go around for everyone who wants one.

Suppose you’re right, and your sustained pattern of corner-cutting is in fact helping you get ahead. How far ahead do you think it’s helping you get? Is it taking you from a 3% chance of getting a tenure-track position at an R1 university to an 80% chance? Almost certainly not. Maybe it’s increasing that probability from 7% to 11%; that would still be a non-trivial relative increase, but it doesn’t change the fact that, for the average grad student, there is no full-time faculty position waiting at the end of the road. Despite what the environment around you may make you think, the choice most graduate students and postdocs face is not actually between (a) maintaining your integrity and “failing” out of science and (b) cutting a few corners and achieving great fame and fortune as a tenured professor. The Incentives are just not that powerful. The vastly more common choice you face as a trainee is between (a) maintaining your integrity and having a pretty low chance of landing a permanent research position, and (b) cutting a bunch of corners that threaten the validity of your work and having a slightly higher (but still low in absolute terms) chance of landing a permanent research position. And even that’s hardly guaranteed, because you never know when there’s someone on a hiring committee who’s going to be turned off by the obvious p-hacking in your work.
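If it helps to see the arithmetic spelled out, here is a minimal sketch using the made-up 7% and 11% figures from the paragraph above. The function name and the dictionary keys are invented for this illustration, and the numbers are the hypothetical ones from the text, not estimates drawn from any real hiring data.

```python
def compare_odds(baseline: float, boosted: float) -> dict:
    """Summarize a change in probability in absolute and relative terms."""
    return {
        # Difference in probability (0.04 means 4 percentage points).
        "absolute_gain": boosted - baseline,
        # Gain expressed as a fraction of the starting probability.
        "relative_gain": (boosted - baseline) / baseline,
        # Probability of NOT landing the position, even with the boost.
        "chance_of_no_job": 1.0 - boosted,
    }


# Hypothetical figures from the text: 7% without corner-cutting, 11% with.
summary = compare_odds(0.07, 0.11)

print(f"absolute gain:    {summary['absolute_gain']:.0%}")     # 4% (percentage points)
print(f"relative gain:    {summary['relative_gain']:.0%}")     # 57%
print(f"chance of no job: {summary['chance_of_no_job']:.0%}")  # 89%
```

Under these assumed numbers, a relative gain that sounds large (57%) buys only four percentage points, and the most likely outcome by far is still no position.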

The point is, the world is complicated, and as a general rule, very few things—including the number of publications you produce—are as important as they seem to be when you’re focusing on them in the moment. If you’re an early-career researcher and you regularly find yourself struggling between doing what’s right and doing what isn’t right but (you think) benefits your career, you may want to take a step back and dispassionately ask yourself whether this integrity versus expediency conflict is actually a productive way to frame things. Instead, consider the alternative framing I suggested above: you are most likely going to leave academia eventually, no matter what you do, so why not at least try to see the process through with some intellectual integrity? And I mean, if you’re really so convinced that The System is Broken, why would you want to stay in it anyway? Do you think standards are going to change dramatically in the next few years? Are you laboring under the impression that you, of all people, are going to somehow save science?

This brings us directly to the next point…

6. You’re (probably) not going to “change things from the inside”

Over the years, I’ve talked to quite a few early-career researchers who have told me that while they can’t really stop engaging in questionable research practices right now without hurting their career, they’re definitely going to do better once they’re in a more established position. These are almost invariably nice, well-intentioned people, and I don’t doubt that they genuinely believe what they say. Unfortunately, what they say is slippery, and has a habit of adapting to changing circumstances. As a grad student or postdoc, it’s easy to think that once you get a faculty position, you’ll be able to start doing research the “right” way. But once you get a faculty position, it then turns out you need to get papers and grants in order to get tenure (I mean, who knew?), so you decide to let the dreaded Incentives win for just a few more years. And then, once you secure tenure, well, now the problem is that your graduate students also need jobs, just like you once did, so you can’t exactly stop publishing at the same rate, can you? Plus, what would all your colleagues think if you effectively said, “oh, you should all treat the last 15 years of my work with skepticism—that was just for tenure”?

I’m not saying there aren’t exceptions. I’m sure there are. But I can think of at least a half-dozen people off-hand who’ve regaled me with some flavor of “once I’m in a better position” story, and none of them, to my knowledge, have carried through on their stated intentions in a meaningful way. And I don’t find this surprising: in most walks of life, course correction generally becomes harder, not easier, the longer you’ve been traveling on the wrong bearing. So if part of your unhealthy respect for The Incentives is rooted in an expectation that those Incentives will surely weaken their grip on you just as soon as you reach the next stage of your career, you may want to rethink your strategy. The Incentives are not going to dissipate as you move up the career ladder; if anything, you’re probably going to have an increasingly difficult time shrugging them off.

7. You’re not thinking long-term

One of the most frustrating aspects of appeals to The Incentives is that they almost invariably seem to focus exclusively on the short-to-medium term. But the long term also matters. And there, I would argue that The Incentives very much favor a radically different—and more honest—approach to scientific research. To see this, we need only consider the ongoing “replication crisis” in many fields of science. One thing that I think has been largely overlooked in discussions about the current incentive structure of science is what impact the replication crisis will have on the legacies of a huge number of presently famous scientists.

I’ll tell you what impact it will have: many of those legacies will be completely zeroed out. And this isn’t just hypothetical scaremongering. It’s happening right now to many former stars of psychology (and, I imagine, other fields I’m less familiar with). There are many researchers we can point to right now who used to be really famous (like, major-chunks-of-the-textbook famous), are currently famous-with-an-asterisk, and will, in all likelihood, be completely unknown again within a couple of decades. The unlucky ones are probably even fated to become infamous—their entire scientific legacies eventually reduced to footnotes in cautionary histories illustrating how easily entire areas of scientific research can lose their footing when practitioners allow themselves to be swept away by concerns about The Incentives.

You probably don’t want this kind of thing to happen to you. I’m guessing you would like to retire with at least some level of confidence that your work, while maybe not Earth-shattering in its implications, isn’t going to be tossed on the scrap heap of history one day by a new generation of researchers amazed at how cavalier you and your colleagues once were about silly little things like “inferential statistics” and “accurate reporting”. So if your justification for cutting corners is that you can’t otherwise survive or thrive in the present environment, you should consider the prospect—and I mean, really take some time to think about it—that any success you earn within the next 10 years by playing along with The Incentives could ultimately make your work a professional joke within the 20 years after that.

8. It achieves nothing and probably makes things worse

Hey, are you a scientist? Yes? Great, here’s a quick question for you: do you think there’s any working scientist on Planet Earth who doesn’t already know that The Incentives are fucked up? No? I didn’t think so. Which means you really don’t need to keep bemoaning The Incentives; I promise you that you’re not helping to draw much-needed attention to an important new problem nobody’s recognized before. You’re not expressing any deep insight by pointing out that hiring committees prefer applicants with lots of publications in high-impact journals to applicants with a few publications in journals no one’s ever heard of. If your complaints are achieving anything at all, they’re probably actually making things worse by constantly (and incorrectly) reminding everyone around you about just how powerful The Incentives are.

Here’s a suggestion: maybe try not talking about The Incentives for a while. You could even try, I don’t know, working against The Incentives for a change. Or, if you can’t do that, just don’t say anything at all. Probably nobody will miss anything, and the early-career researchers among us might even be grateful for a respite from their senior colleagues’ constant reminder that The System—the very same system those senior colleagues are responsible for creating!—is so fucked up.

9. It’s your job

This last one seems so obvious it should go without saying, but it does need saying, so I’ll say it: a good reason why you should avoid hanging bad behavior on The Incentives is that you’re a scientist, and trying to get closer to the truth, and not just to tenure, is in your fucking job description. Taxpayers don’t fund you because they care about your career; they fund you to learn shit, cure shit, and build shit. If you can’t do your job without having to regularly excuse sloppiness on the grounds that you have no incentive to be less sloppy, at least have the decency not to say that out loud in a crowded room or Twitter feed full of people who indirectly pay your salary. Complaining that you would surely do the right thing if only these terrible Incentives didn’t exist doesn’t make you the noble martyr you think it does; to almost anybody outside your field who has a modicum of integrity, it just makes you sound like you’re looking for an easy out. It’s not sophisticated or worldly or politically astute, it’s just dishonest and lazy. If you find yourself unable to do your job without regularly engaging in practices that clearly devalue the very science you claim to care about, and this doesn’t bother you deeply, then maybe the problem is not actually The Incentives—or at least, not The Incentives alone. Maybe the problem is You.

Neurohackademy 2018: A wrap-up

It’s become something of a truism in recent years that scientists in many fields find themselves drowning in data. This is certainly the case in neuroimaging, where even small functional MRI datasets typically consist of billions of observations (e.g., 100,000 points in the brain, each measured at 1,000 distinct timepoints, in each of 20 subjects). Figuring out how to store, manage, analyze, and interpret data on this scale is a monumental challenge–and one that arguably requires a healthy marriage between traditional neuroimaging and neuroscience expertise, and computational skills more commonly found in data science, statistics, or computer science departments.
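As a rough back-of-the-envelope check on that scale, here is a small sketch. The three counts are the illustrative ones from the paragraph above; the 4 bytes per value (32-bit floats) is an assumption made for this example, not a property of any particular dataset or file format.

```python
# Illustrative figures from the text.
n_points = 100_000     # measurement points in the brain
n_timepoints = 1_000   # measurements taken at each point
n_subjects = 20

# Assumption for this sketch: each observation is stored as a 32-bit float.
BYTES_PER_VALUE = 4

n_observations = n_points * n_timepoints * n_subjects
size_gb = n_observations * BYTES_PER_VALUE / 1e9

print(f"{n_observations:,} observations")        # 2,000,000,000 observations
print(f"about {size_gb:.0f} GB uncompressed")    # about 8 GB uncompressed
```

Under these assumptions, even this "small" example is roughly two billion numbers and several gigabytes before any preprocessing or derived outputs are added.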

In an effort to help bridge this gap, Ariel Rokem and I have spent part of our summer each of the last three years organizing a summer institute at the intersection of neuroimaging and data science. The most recent edition of the institute–Neurohackademy 2018–just wrapped up last week, so I thought this would be a good time to write up a summary of the course: what the course is about, who attended and instructed, what everyone did, and what lessons we’ve learned.

What is Neurohackademy?

Neurohackademy started its life in Summer 2016 as the somewhat more modestly-named Neurohackweek–a one-week program for 40 participants modeled on Astrohackweek, a course organized by the eScience Institute in collaboration with data science initiatives at Berkeley and NYU. The course was (and continues to be) held on the University of Washington’s beautiful campus in Seattle, where Ariel is based (I make the trip from Austin, Texas every year–which, as you can imagine, is a terrible sacrifice on my part given the two locales’ respective summer climates). The first two editions were supported by UW’s eScience Institute (and indirectly, by grants from the Moore and Sloan foundations). Thanks to generous support from the National Institute of Mental Health (NIMH), this year the course expanded to two weeks, 60 participants, and over 20 instructors (our funding continues through 2021, so there will be at least 3 more editions).

The overarching goal of the course is to give neuroimaging researchers the scientific computing and data science skills they need in order to get the most out of their data. Over the course of two weeks, we cover a variety of introductory and (occasionally) advanced topics in data science, and demonstrate how they can be productively used in a range of neuroimaging applications. The course is loosely structured into three phases (see the full schedule here): the first few days feature domain-general data science tutorials; the next few days focus on sample neuroimaging applications; and the last few days consist of a full-blown hackathon in which participants pitch potential projects, self-organize into groups, and spend their time collaboratively working on a variety of software, analysis, and documentation projects.

Who attended?

Admission to Neurohackademy 2018 was extremely competitive: we received nearly 400 applications for just 60 spots. This was a very large increase from the previous two years, presumably reflecting the longer duration of the course and/or our increased efforts to publicize it. While we were delighted by the deluge of applications, it also meant we had to be far more selective about admissions than in previous years. The highly interactive nature of the course, coupled with the high per-participant costs (we provide two weeks of accommodations and meals), makes it unlikely that Neurohackademy will grow beyond 60 participants in future editions, despite the clear demand. Our rough sense is that somewhere between half and two-thirds of all applicants were fully qualified and could have easily been admitted, so there’s no question that, for many applicants, blind luck played a large role in determining whether or not they were accepted. I mention this mainly for the benefit of people who applied for the 2018 course and didn’t make it in: don’t take it personally! There’s always next year. (And, for that matter, there are also a number of other related summer schools we encourage people to apply to, including the Methods in Neuroscience at Dartmouth Computational Summer School, Allen Institute Summer Workshop on the Dynamic Brain, Summer School in Computational Sensory-Motor Neuroscience, and many others.)

The 60 participants who ended up joining us came from a diverse range of demographic backgrounds, academic disciplines, and skill levels. Most of our participants were trainees in academic programs (40 graduate students, 12 postdocs), but we also had 2 faculty members, 6 research staff, and 2 medical residents (note that all of these counts include 4 participants who were admitted to the course but declined to, or could not, attend). We had nearly equal numbers of male and female participants (30F, 33M), and 11 participants came from traditionally underrepresented backgrounds. 43 participants were from institutions or organizations based in the United States, with the remainder coming from 14 different countries around the world.

The disciplinary backgrounds and expertise levels of participants are a bit harder to estimate for various reasons, but our sense is that the majority (perhaps two-thirds) of participants received their primary training in non-computational fields (psychology, neuroscience, etc.). This was not necessarily by design–i.e., we didn’t deliberately favor applicants from biomedical fields over applicants from computational fields–and primarily mirrored the properties of the initial applicant pool. We did impose a hard requirement that participants should have at least some prior expertise in both programming and neuroimaging, but subject to that constraint, there was enormous variation in previous experience along both dimensions–something that we see as a desirable feature of the course (more on this below).

We intend to continue to emphasize and encourage diversity at Neurohackademy, and we hope that all of our participants experienced the 2018 edition as a truly inclusive, welcoming event.

Who taught?

We were fortunate to be able to bring together more than 20 instructors with world-class expertise in a diverse range of areas related to neuroimaging and data science. “Instructor” is a fairly loose term at Neurohackademy: we deliberately try to keep the course non-hierarchical, so that for the most part, instructors are just participants who happen to fall on the high-experience tail of the experience distribution. That said, someone does have to teach the tutorials and lectures, and we were lucky to have a stellar cast of experts on hand. Many of the data science tutorials during the first phase of the course were taught by eScience staff and UW faculty kind enough to take time out of their other duties to help teach participants a range of core computing skills: Git and GitHub (Bernease Herman), R (Valentina Staneva and Tara Madhyastha), web development (Anisha Keshavan), and machine learning (Jake Vanderplas), among others.

In addition to the local instructors, we were joined for the tutorial phase by Kirstie Whitaker (Turing Institute), Chris Gorgolewski (Stanford), Satra Ghosh (MIT), and JB Poline (McGill)–all veterans of the course from previous years (Kirstie was a participant at the first edition!). We’re particularly indebted to Kirstie and Chris for their immense help. Kirstie was instrumental in helping a number of participants bridge the (large!) gap between using git privately, and using it to actively collaborate on a public project. As one of the participants elegantly put it:

Chris shouldered a herculean teaching load, covering Docker, software testing, BIDS and BIDS-Apps, and also leading an open science panel. I’m told he even sleeps on occasion.

We were also extremely lucky to have Fernando Perez (Berkeley)–the creator of IPython and leader of the Jupyter team–join us for several days; his presentation on Jupyter (videos: part 1 and part 2) was one of the highlights of the course for me personally, and I heard many other instructors and participants share the same sentiment. Jupyter was a critical part of our course infrastructure (more on that below), so it was fantastic to have Fernando join us and share his insights on the fascinating history of Jupyter, and on reproducible science more generally.

As the course went on, we transitioned from tutorials focused on core data science skills to more traditional lectures focusing on sample applications of data science methods to neuroimaging data. Instructors during this phase of the course included Tor Wager (Colorado), Eva Dyer (Georgia Tech), Gael Varoquaux (INRIA), Tara Madhyastha (UW), Sanmi Koyejo (UIUC), and Nick Cain and Justin Kiggins (Allen Institute for Brain Science). We continued to emphasize hands-on interaction with data; many of the presenters during this phase spent much of their time showing participants how to work with programmatic tools to generate the kinds of results one might find in papers they’ve authored (e.g., Tor Wager and Gael Varoquaux demonstrated tools for neuroimaging data analysis written in Matlab and Python, respectively).

The fact that so many leading experts were willing to take large chunks of time out of their schedule (most of the instructors hung around for several days, facilitating extended interactions with participants) to visit with us at Neurohackademy speaks volumes about the kind of people who make up the neuroimaging data science community. We’re tremendously grateful to these folks for their contributions, and hope they’ll return to teach at future editions of the institute.

What did we cover?

The short answer is: see for yourself! We’ve put most of the slides, code, and videos from the course online, and encourage people to interact with, learn from, and reuse these materials.

Now the long(er) answer. One of the challenges in organizing scientific training courses that focus on technical skill development is that participants almost invariably arrive with a wide range of backgrounds and expertise levels. At Neurohackademy, some of the participants were effectively interchangeable with instructors, while others were relatively new to programming and/or neuroimaging. The large variance in technical skill is a feature of the course, not a bug: while we require all admitted participants to have some prior programming background, we’ve found that having a range of skill levels is an excellent way to make sure that everyone is surrounded by people who they can alternately learn from, help out, and collaborate with.

That said, the wide range of backgrounds does present some organizational challenges: introductory sessions often bore more advanced participants, while advanced sessions tend to frustrate newcomers. To accommodate the range of skill levels, we tried to design the course in a way that benefits as many people as possible (though we don’t pretend to think it worked great for everyone). During the first two days, we featured two tracks of tutorials at most times, with simultaneously-held presentations generally differing in topic and/or difficulty (e.g., Git/GitHub opposite Docker; introduction to Python opposite introduction to R; basic data visualization opposite computer vision).

Throughout Neurohackademy, we deliberately placed heavy emphasis on the Python programming language. We think Python has a lot going for it as a lingua franca of data science and scientific computing. The language is free, performant, relatively easy to learn, and very widely used within the data science, neuroimaging, and software development communities. It also helps that many of our instructors (e.g., Fernando Perez, Jake Vanderplas, and Gael Varoquaux) are major contributors to the scientific Python ecosystem, so there was a very high concentration of local Python expertise to draw on. That said, while most of our instruction was done in Python, we were careful to emphasize that participants were free to work in whatever language(s) they liked. We deliberately included tutorials and lectures that featured R, Matlab, or JavaScript, and a number of participant projects (see below) were written partly or entirely in other languages, including R, Matlab, JavaScript, and C.

We’ve also found that the tooling we provide to participants matters–a lot. A robust, common computing platform can spell the difference between endless installation problems that eat into valuable course time, and a nearly seamless experience that participants can dive into right away. At Neurohackademy, we made extensive use of the Jupyter suite of tools for interactive computing. In particular, thanks to Ariel’s heroic efforts (which built on some very helpful docs, and on similarly heroic efforts by Chris Holdgraf, Yuvi Panda, and Satra Ghosh last year), we were able to conduct a huge portion of our instruction and collaborative hacking using a course-wide JupyterHub allocation, deployed via Kubernetes, running on Google Cloud. This setup allowed Ariel to create a common web-accessible environment for all course participants, so that, at the push of a button, each participant was dropped into a JupyterLab environment containing many of the software dependencies, notebooks, and datasets we used throughout the course. While we did run into occasional scaling bottlenecks (usually when an instructor demoed a computationally intensive method, prompting dozens of people to launch the same process in their pods), for the most part, our participants were able to drop into a running JupyterLab instance within seconds and immediately start interactively playing with the code being presented by instructors.

Surprisingly (at least to us), our total Google Cloud computing costs for the entire two-week, 60-participant course came to just $425. Obviously, that number could have easily skyrocketed had we scaled up our allocation dramatically and allowed our participants to execute arbitrarily large jobs (e.g., preprocessing data from all ~1,200 HCP subjects). But we thought the limits we imposed were pretty reasonable, and our experience suggests that not only is JupyterHub an excellent platform from a pedagogical standpoint, but it can also be an extremely cost-effective one.
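For a sense of scale, here is a quick sketch of what that total works out to per person. The $425 and 60-participant figures come from above; treating "two-week" as 14 calendar days is an assumption, and the function names are just illustrative.

```python
# Rough per-participant cloud cost for the course.
# The total and headcount are the figures quoted in the post;
# COURSE_DAYS is an assumption ("two-week" counted as 14 days).

TOTAL_CLOUD_COST_USD = 425
PARTICIPANTS = 60
COURSE_DAYS = 14  # assumption


def cost_per_participant(total_cost: float, participants: int) -> float:
    """Average cloud spend per participant over the whole course."""
    return total_cost / participants


def cost_per_participant_day(total_cost: float, participants: int, days: int) -> float:
    """Average cloud spend per participant per day."""
    return total_cost / (participants * days)


if __name__ == "__main__":
    per_person = cost_per_participant(TOTAL_CLOUD_COST_USD, PARTICIPANTS)
    per_day = cost_per_participant_day(TOTAL_CLOUD_COST_USD, PARTICIPANTS, COURSE_DAYS)
    print(f"Per participant: ${per_person:.2f}")
    print(f"Per participant per day: ${per_day:.2f}")
```

That comes to about seven dollars per participant for the whole course, or roughly fifty cents per participant per day under the 14-day assumption.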

What did we produce?

Had Neurohackademy produced nothing at all besides the tutorials, slides, and videos generated by instructors, I think it’s fair to say that participants would still have come away feeling that they learned a lot (more on that below). But a major focus of the institute was on actively hacking on the brain–or at least, on data related to the brain. To that end, the last 3.5 days of the course were dedicated exclusively to a full-blown hackathon in which participants pitched potential projects, self-organized into groups, and then spent their time collaboratively working on a variety of software, analysis, and documentation projects. You can find a list of most of the projects on the course projects repository (most link out to additional code or resources).

As one might expect given the large variation in participant experience, project group size, and time investment (some people stuck to one project for all three days, while others moved around), the scope of projects varied widely. From our perspective–and we tried to emphasize this point throughout the hackathon–the important thing was not what participants’ final product looked like, but how much they learned along the way. There’s always a tension between exploitation and exploration at hackathons, with some people choosing to spend most of their time expanding on existing projects using technologies they’re already familiar with, and others deciding to start something completely new, or to try out a new language–and then having to grapple with the attendant learning curve. While some of the projects were based on packages that predated Neurohackademy, most participants ended up working on projects they came up with de novo at the institute, often based on tools or resources they first learned about during the course. I’ll highlight just three projects here that provide a representative cross-section of the range of things people worked on:

1. Peer Herholz and Rita Ludwig created a new BIDS-app called Bidsonym for automated de-identification of neuroimaging data. The app is available from Docker Hub, and features not one, not two, but three different de-identification algorithms. If you want to shave the faces off of your MRI participants with minimal fuss, make friends with Bidsonym.

2. A group of eight participants ambitiously set out to develop a new “O-Factor” metric intended to serve as a relative measure of the openness of articles published in different neuroscience-related journals. The project involved a variety of very different tasks, including scraping (public) data from the PubMed Central API, computing new metrics of code and data sharing, and interactively visualizing the results using a d3 dashboard. While the group was quick to note that their work is preliminary, and has a bunch of current limitations, the results look pretty great–though some disappointment was (facetiously) expressed during the project presentations that the journal Nature is not, as some might have imagined, a safe house where scientific datasets can hide from the prying public.

3. Emily Wood, Rebecca Martin, and Rosa Li worked on tools to facilitate mixed-model analysis of fMRI data using R. Following a talk by Tara Madhyastha on her Neuropointillist R framework for fMRI data analysis, the group decided to create a new series of fully reproducible Markdown-based tutorials for the package (the original documentation was based on non-public datasets). The group expanded on the existing installation instructions (discovering some problems in the process), created several tutorials and examples, and also ended up patching the neuropointillist code to work around a very heavy dependency (FSL).

You can read more about these 3 projects and 14 others on the project repository, and in some cases, you can even start using the tools right away in your own work. Or you could just click through and stare at some of the lovely images participants produced.

So, how did it go?

It went great!

Admittedly, Ariel and I aren’t exactly impartial parties–we wouldn’t keep doing this if we didn’t think participants get a lot out of it. But our assessment isn’t based just on our personal impressions; we have participants fill out a detailed (and anonymous) survey every year, and go out of our way to encourage additional constructive criticism from the participants (which a majority provide). So I don’t think we’re being hyperbolic when we say that most people who participated in the course had an extremely educational and enjoyable experience. Exhibit A is this set of unsolicited public testimonials, courtesy of Twitter:

The organizers and instructors all worked hard to build an event that would bring people together as a collaborative and productive (if temporary) community, and it’s very gratifying to see those goals reflected in participants’ experiences.

Of course, that’s not to say there weren’t things we could do better; there were plenty, and we’ve already made plans to adjust and improve the course next year based on feedback we received. For example, some suggestions we received from multiple participants included adding more ice-breaking activities early on in the course; reducing the intensity of the tutorial/lecture schedule the first week (we went 9 am to 6 pm every day, stopping only for an hourlong lunch and a few short breaks); and adding designated periods for interaction with instructors and other participants. We intend to address these (and several other) recommendations in next year’s edition, and expect it to look slightly different from (and hopefully better than!) Neurohackademy 2018.

Thank you!

I think that’s a reasonable summary of what went on at Neurohackademy 2018. We’re delighted at how the event turned out, and are happy to answer questions (feel free to leave them in the comments below, or to email Ariel and/or me).

We’d like to end by thanking all of the people and organizations who helped make Neurohackademy 2018 a success: NIMH for providing the funding that makes Neurohackademy possible; the eScience Institute and staff for throwing their wholehearted support behind the course (particularly our awesome course coordinator, Rachael Murray); and the many instructors who each generously took several days (and in a few cases, more than a week!) out of their schedule, unpaid, to come to Seattle and share their knowledge with a bunch of enthusiastic strangers. On a personal note, I’d also like to thank Ariel, who did the lion’s share of the actual course directing. I mostly just get to show up in Seattle, teach some stuff, hang out with great people, and write a blog post about it.

Lastly, and above all else, we’d like to thank our participants. It’s a huge source of inspiration and joy to us each year to see what a group of bright, enthusiastic, motivated researchers can achieve when given time, space, and freedom (and, okay, maybe also a large dollop of cloud computing credits). We’re looking forward to at least three more years of collaborative, productive neurohacking!

Yes, your research is very noble. No, that’s not a reason to flout copyright law.

Scientific research is cumulative; many elements of a typical research project would not and could not exist but for the efforts of many previous researchers. This goes not only for knowledge, but also for measurement. In much of the clinical world–and also in many areas of “basic” social and life science research–people routinely save themselves inordinate amounts of work by using behavioral or self-report measures developed and validated by other researchers.

Among many researchers who work in fields heavily dependent on self-report instruments (e.g., personality psychology), there appears to be a tacit belief that, once a measure is publicly available–either because it’s reported in full in a journal article, or because all of the items and instructions can be found on the web–it’s fair game for use in subsequent research. There’s a time-honored tradition of asking one’s colleagues if they happen to “have a copy” of the NEO-PI-3, or the Narcissistic Personality Inventory, or the Hamilton Depression Rating Scale. The fact that many such measures are technically published under restrictive copyright licenses, and are often listed for sale at rather exorbitant prices (e.g., you can buy 25 paper copies of the NEO-PI-3 from the publisher for $363 US), does not seem to deter researchers much. The general understanding seems to be that if a measure is publicly available, it’s okay to use it for research purposes. I don’t think most researchers have a well-thought-out, internally consistent justification for this behavior; it seems to almost invariably be an article of tacit belief that nothing bad can or should happen to someone who uses a commercially available instrument for a purpose as noble as scientific research.

The trouble with tacit beliefs is that, like all beliefs, they can sometimes be wrong–only, because they’re tacit, they’re often not evaluated openly until things go horribly wrong. Exhibit A on the frontier of horrible wrongness is a recent news article in Science that reports on a rather disconcerting case where the author of a measure (the Eight-Item Morisky Medication Adherence Scale–which also provides a clue to its author’s name) has been demanding rather large sums of money (ranging from $2000 to $6500) from the authors of hundreds of published articles that have used the MMAS-8 without explicitly requesting permission. As the article notes, there appears to be a general agreement that Morisky is within his legal rights to demand such payment; what people seem to be objecting to is the amount Morisky is requesting, and the way he’s going about the process (i.e., with lawyers):

Morisky is well within his rights to seek payment for use of his copyrighted tool. U.S. law encourages academic scientists and their universities to protect and profit from their inventions, including those developed with public funds. But observers say Morisky’s vigorous enforcement and the size of his demands stand out. “It’s unusual that he is charging as much as he is,” says Kurt Geisinger, director of the Buros Center for Testing at the University of Nebraska in Lincoln, which evaluates many kinds of research-related tests. He and others note that many scientists routinely waive payments for such tools, as long as they are used for research.

It’s a nice article, and I think it suggests two things fairly clearly. First, Morisky is probably not a very nice man. He seems to have no compunction about charging resource-strapped researchers in third-world countries licensing fees that require them to take out loans from their home universities, and he would apparently rather see dozens of published articles retracted from the literature than suffer the indignity of having someone use his measure without going through the proper channels (and paying the corresponding fees).

Second, the normative practice in many areas of science that depend on the (re)use of measures developed by other people is to essentially flout copyright law, bury one’s head in the sand, and hope for the best.

I don’t know that anything can be done about the first observation–and even if something could be done, there will always be other Moriskys. I do, however, think that we could collectively do quite a few things to change the way scientists think about, and deal with, the re-use of self-report (and other kinds of) measures. Most of these amount to providing better guidance and training. In principle, this shouldn’t be hard to do; in most disciplines, scientists are trained in all manner of research method, statistical praxis, and scientific convention. Yet I know of no graduate program in my own discipline (psychology) that provides its students with even a cursory overview of intellectual property law. This despite the fact that many scientists’ chief assets–and the things they most closely identify their career achievements with–are their intellectual products.

This is, in my view, a serious training failure. More important, it’s an unnecessary failure, because there isn’t really very much that a social scientist needs to know about copyright law in order to dramatically reduce their odds of ending up a target of legal action. The goal is not to train PhDs who can moonlight as bad attorneys; it’s to prevent behavior that flagrantly exposes one to potential Moriskying (look! I coined a verb!). For that, a single 15-minute segment of a research methods class would likely suffice. While I’m sure someone better-informed and more lawyer-like than me could come up with a more accurate precis, here’s the gist of what I think one would want to cover:

  • Just because a measure is publicly available does not mean it’s in the public domain. It’s intuitive to suppose that any measure that can be found in a publicly accessible place (e.g., on the web) is, by default, okay for public use–meaning that, unless the author of a measure has indicated that they don’t want their measure to be used by others, it can be. In fact, the opposite is true. By default, the author of a newly produced work retains all usage and distribution rights to that work. The author can, if they are so inclined, immediately place that work in the public domain. Alternatively, they could stipulate that every time someone uses their measure, that user must, within 72 hours of use, immediately send the author 22 green jelly beans in an unmarked paper bag. You don’t like those terms of use? Fine: don’t use the measure.

Importantly, an author isn’t under any obligation to say anything at all about how they wish their work to be reproduced or used. This means that when a researcher uses a measure that lacks explicit licensing information, that researcher is assuming the risk of running afoul of the measure author’s desires, whether or not those desires have been made publicly known. The fact that the measure happens to be publicly available may be a mitigating factor (e.g., one could potentially claim fair use, though as far as I know there’s little precedent for this type of thing in the scientific domain), but that’s a matter for lawyers to hash out, and I think most of us scientists would rather avoid lawyer-hashing if we can help it.

This takes us directly to the next point…

  • Don’t use a measure unless you’ve read, and agree with, its licensing terms. Of course, in practice, very few scientific measures are currently released with an explicit license–which gives rise to an important corollary injunction: don’t use a measure that doesn’t come with a license.

The latter statement may seem unfair; after all, it’s clear enough that most measures developed by social scientists are missing licenses not because their authors are intentionally trying to capitalize on ambiguity, but simply because most authors are ignorant of the fact that the lack of a license creates a significant liability for potential users. Walking away from unlicensed measures would amount to giving up on huge swaths of potential research, which surely doesn’t seem like a good idea.

Fortunately, I’m not suggesting anything nearly this drastic. Because the lack of licensing is typically unintentional, often, a simple, friendly email to an author may be sufficient to magic an explicit license into existence. While I haven’t had occasion to try this yet for self-report measures, I’ve been on both ends of such requests on multiple occasions when dealing with open-source software. In virtually every case I’ve been involved in, the response to an inquiry along the lines of “hey, I’d like to use your software, but there’s no license information attached” has been to either add a license to the repository (for example…), or provide an explicit statement to the effect of “you’re welcome to use this for the use case you describe”. Of course, if a response is not forthcoming, that too is instructive, as it suggests that perhaps steering clear of the tool (or measure) in question might be a good idea.

Of course, taking licensing seriously requires one to abide by copyright law–which, like it or not, means that there may be cases where the responsible (and legal) thing to do is to just walk away from a measure, even if it seems perfect for your use case from a research standpoint. If you’re serious about taking copyright seriously, and, upon emailing the author to inquire about the terms of use, you’re informed that the terms of use involve paying $100 per participant, you can either put up the money, or use a different measure. Burying your head in the sand and using the measure anyway, without paying for it, is not a good look.

  • Attach a license to every reusable product you release into the wild. This follows directly from the previous point: if you want responsible, informed users to feel comfortable using your measure, you should tell them what they can and can’t do with it. If you’re so inclined, you can of course write your own custom license, which can involve dollar bills, jelly beans, or anything else your heart desires. But unless you feel a strong need to depart from existing practices, it’s generally a good idea to select one of the many pre-existing licenses out there, because most of them have the helpful property of having been written by lawyers, and lawyers are people who generally know how to formulate sentiments like “you must give me heap big credit” in somewhat more precise language.

There are a lot of practical recommendations out there about what license one should or shouldn’t choose; I won’t get into those here, except to say that in general, I’m a strong proponent of using permissive licenses (e.g., MIT or CC-BY), and also, that I agree with many people’s sentiment that placing restrictions on commercial use–while intuitively appealing to scientists who value public goods–is generally counterproductive. In any case, the real point here is not to push people to use any particular license, but just to think about it for a few minutes when releasing a measure. I mean, you’re probably going to spend tens or hundreds of hours thinking about the measure itself; the least you can do is make sure you tell people what they’re allowed to do with it.

I think covering just the above three points in the context of a graduate research methods class–or at the very least, in those methods classes slanted towards measure development or evaluation (e.g., psychometrics)–would go a long way towards changing scientific norms surrounding measure use.

Most importantly, perhaps, the point of learning a little bit about copyright law is not just to reduce one’s exposure to legal action. There are also large communal benefits. If academic researchers collectively decided to stop flouting copyright law when choosing research measures, the developers of measures would face a very different–and, from a societal standpoint, much more favorable–set of incentives. The present state of affairs–where an instrument’s author is able to legally charge well-meaning researchers exorbitant fees post-hoc for use of an 8-item scale–exists largely because researchers refuse to take copyright seriously, and insist on acting as if science, being such a noble and humanitarian enterprise, is somehow exempt from legal considerations that people in other fields have to constantly worry about. Perversely, the few researchers who do the right thing by offering to pay for the scales they use then end up incurring large costs, while the majority who use the measures without permission suffer no consequences (except on the rare occasions when someone like Morisky comes knocking on the door with a lawyer).

By contrast, in an academic world that cared more about copyright law, many widely-used measures that are currently released under ambiguous or restrictive licenses (or, most commonly, no license at all) would never have attained widespread use in the first place. If, say, Costa & McCrae’s NEO measures–used by thousands of researchers every year–had been developed in a world where academics had a standing norm of avoiding restrictively licensed measures, the most likely outcome is that the NEO would have changed to accommodate the norm, and not vice versa. The net result is that we would be living in a world where the vast majority of measures–just like the vast majority of open-source software–really would be free to use in every sense of the word, without risk of lawsuits, and with the ability to redistribute, reuse, and modify freely. That, I think, is a world we should want to live in. And while the ship may have already sailed when it comes to the most widely used existing measures, it’s a world we could still have going forward. We just have to commit to not using new measures unless they have a clear license–and be prepared to follow the terms of that license to the letter.

whether or not you should pursue a career in science still depends mostly on that thing that is you

I took the plunge a couple of days ago and answered my first question on Quora. Since Brad Voytek won’t shut up about how great Quora is, I figured I should give it a whirl. So far, Brad is not wrong.

The question in question is: “How much do you agree with Johnathan Katz’s advice on (not) choosing science as a career? Or how realistic is it today (the article was written in 1999)?” The Katz piece referred to is here. The gist of it should be familiar to many academics; the argument boils down to the observation that relatively few people who start graduate programs in science actually end up with permanent research positions, and even then, the need to obtain funding often crowds out the time one has to do actual science. Katz’s advice is basically: don’t pursue a career in science. It’s not an optimistic piece.

My answer is, I think, somewhat more optimistic. Here’s the full text:

The real question is what you think it means to be a scientist. Science differs from many other professions in that the typical process of training as a scientist–i.e., getting a Ph.D. in a scientific field from a major research university–doesn’t guarantee you a position among the ranks of the people who are training you. In fact, it doesn’t come close to guaranteeing it; the proportion of PhD graduates in science who go on to obtain tenure-track positions at research-intensive universities is very small–around 10% in most recent estimates. So there is a very real sense in which modern academic science is a bit of a pyramid scheme: there are a relatively small number of people at the top, and a lot of people on the rungs below laboring to get up to the top–most of whom will, by definition, fail to get there.

If you equate a career in science solely with a tenure-track position at a major research university, and are considering the prospect of a Ph.D. in science solely as an investment intended to secure that kind of position, then Katz’s conclusion is difficult to escape. He is, in most respects, correct: in most biomedical, social, and natural science fields, science is now an extremely competitive enterprise. Not everyone makes it through the PhD; of those who do, not everyone makes it into–and then through–one or more postdocs; and of those who do that, relatively few secure tenure-track positions. Then, of those few “lucky” ones, some will fail to get tenure, and many others will find themselves spending much or most of their time writing grants and managing people instead of actually doing science. So from that perspective, Katz is probably right: if what you mean when you say you want to become a scientist is that you want to run your own lab at a major research university, then your odds of achieving that at the outset are probably not very good (though, to be clear, they’re still undoubtedly better than your odds of becoming a successful artist, musician, or professional athlete). Unless you have really, really good reasons to think that you’re particularly brilliant, hard-working, and creative (note: undergraduate grades, casual feedback from family and friends, and your own internal gut sense do not qualify as really, really good reasons), you probably should not pursue a career in science.

But that’s only true given a rather narrow conception where your pursuit of a scientific career is motivated entirely by the end goal rather than by the process, and where failure is anything other than ending up with a permanent tenure-track position. By contrast, if what you’re really after is an environment in which you can pursue interesting questions in a rigorous way, surrounded by brilliant minds who share your interests, and with more freedom than you might find at a typical 9 to 5 job, the dream of being a scientist is certainly still alive, and is worth pursuing. The trivial demonstration of this is that if you’re one of the many people who actually enjoy the graduate school environment (yes, they do exist!), it may not even matter to you that much whether or not you have a good shot of getting a tenure-track position when you graduate.

To see this, imagine that you’ve just graduated with an undergraduate degree in science, and someone offers you a choice between two positions for the next six years. One position is (relatively) financially secure, but involves rather boring work of questionable utility to society, an inflexible schedule, and colleagues who are mostly only there for a paycheck. The other position has terrible pay, but offers fascinating and potentially important work, a flexible lifestyle, and colleagues who are there because they share your interests and want to do scientific research.

Admittedly, real-world choices are rarely this stark. Many non-academic jobs offer many of the same perceived benefits of academia (e.g., many tech jobs offer excellent working conditions, flexible schedules, and important work). Conversely, many academic environments don’t quite live up to the ideal of a place where you can go to pursue your intellectual passion unfettered by the annoyances of “real” jobs–there’s often just as much in the way of political intrigue, personality dysfunction, and menial dues-paying duties. But to a first approximation, this is basically the choice you have when considering whether to go to graduate school in science or pursue some other career: you’re trading financial security and a fixed 40-hour work week against intellectual engagement and a flexible lifestyle. And the point to note is that, even if we completely ignore what happens after the six years of grad school are up, there is clearly a non-negligible segment of the population who would quite happily opt for the second choice–even recognizing full well that at the end of six years they may have to leave and move on to something else, with little to show for their effort. (Of course, in reality we don’t need to ignore what happens after six years, because many PhDs who don’t get tenure-track positions find rewarding careers in other fields–many of them scientific in nature. And, even though it may not be a great economic investment, having a Ph.D. in science is a great thing to be able to put on one’s resume when applying for a very broad range of non-academic positions.)

The bottom line is that whether or not you should pursue a career in science has as much or more to do with your goals and personality as it does with the current environment within or outside of (academic) science. In an ideal world (which is certainly what the 1970s as described by Katz sound like, though I wasn’t around then), it wouldn’t matter: if you had any inkling that you wanted to do science for a living, you would simply go to grad school in science, and everything would probably work itself out. But given real-world constraints, it’s absolutely essential that you think very carefully about what kind of environment makes you happy and what your expectations and goals for the future are. You have to ask yourself: Am I the kind of person who values intellectual freedom more than financial security? Do I really love the process of actually doing science–not some idealized movie version of it, but the actual messy process–enough to warrant investing a huge amount of my time and energy over the next few years? Can I deal with perpetual uncertainty about my future? And ultimately, would I be okay doing something that I really enjoy for six years if at the end of that time I have to walk away and do something very different?

If the answer to all of these questions is yes–and for many people it is!–then pursuing a career in science is still a very good thing to do (and hey, you can always quit early if you don’t like it–then you’ve lost very little time!). If the answer to any of them is no, then Katz may be right. A prospective career in science may or may not be for you, but at the very least, you should carefully consider alternative prospects. There’s absolutely no shame in going either route; the important thing is just to make an honest decision that takes the facts as they are and not as you wish that they were.

A couple of other thoughts I’ll add belatedly:

  • Calling academia a pyramid scheme is admittedly a bit hyperbolic. It’s true that the personnel structure in academia broadly has the shape of a pyramid, but that’s true of most organizations in most other domains too. Pyramid schemes are typically built on promises and lies that (almost by definition) can’t be realized, and I don’t think many people who enter a Ph.D. program in science can claim with a straight face that they were guaranteed a permanent research position at the end of the road (or that it’s impossible to get such a position). As I suggested in this post, it’s much more likely that everyone involved is simply guilty of minor (self-)deception: faculty don’t go out of their way to tell prospective students what the odds are of actually getting a tenure-track position, and prospective grad students don’t work very hard to find out the painful truth, or to tell faculty what their real intentions are after they graduate. And it may actually be better for everyone that way.
  • Just in case it’s not clear from the above, I’m not in any way condoning the historically low levels of science funding, or the fact that very few science PhDs go on to careers in academic research. I would love for NIH and NSF budgets (or whatever your local agency is) to grow substantially–and for everyone to get exactly the kind of job they want, academic or not. But that’s not the world we live in, so we may as well be pragmatic about it and try to identify the conditions under which it does or doesn’t make sense to pursue a career in science right now.
  • I briefly mention this above, but it’s probably worth stressing that there are many jobs outside of academia that still allow one to do scientific research, albeit typically with less freedom (but often for better hours and pay). In particular, the market for data scientists is booming right now, and many of the hires are coming directly from academia. One lesson to take away from this is: if you’re in a science Ph.D. program right now, you should really spend as much time as you can building up your quantitative and technical skills, because they could very well be the difference between a job that involves scientific research and one that doesn’t in the event you leave academia. And those skills will still serve you well in your research career even if you end up staying in academia.

 

the truth is not optional: five bad reasons (and one mediocre one) for defending the status quo

You could be forgiven for thinking that academic psychologists have all suddenly turned into professional whistleblowers. Everywhere you look, interesting new papers are cropping up purporting to describe this or that common-yet-shady methodological practice, and telling us what we can collectively do to solve the problem and improve the quality of the published literature. In just the last year or so, Uri Simonsohn introduced new techniques for detecting fraud, and used those tools to identify at least 3 cases of high-profile, unabashed data forgery. Simmons and colleagues reported simulations demonstrating that standard exploitation of research degrees of freedom in analysis can produce extremely high rates of false positive findings. Pashler and colleagues developed a “Psych file drawer” repository for tracking replication attempts. Several researchers raised trenchant questions about the veracity and/or magnitude of many high-profile psychological findings such as John Bargh’s famous social priming effects. Wicherts and colleagues showed that authors of psychology articles who are less willing to share their data upon request are more likely to make basic statistical errors in their papers. And so on and so forth. The flood shows no signs of abating; just last week, the APS journal Perspectives on Psychological Science announced that it’s introducing a new “Registered Replication Report” section that will commit to publishing pre-registered high-quality replication attempts, irrespective of their outcome.

Personally, I think these are all very welcome developments for psychological science. They’re solid indications that we psychologists are going to be able to police ourselves successfully in the face of some pretty serious problems, and they bode well for the long-term health of our discipline. My sense is that the majority of other researchers–perhaps the vast majority–share this sentiment. Still, as with any zeitgeist shift, there are always naysayers. In discussing these various developments and initiatives with other people, I’ve found myself arguing, with somewhat surprising frequency, with people who for various reasons think it’s not such a good thing that Uri Simonsohn is trying to catch fraudsters, or that social priming findings are being questioned, or that the consequences of flexible analyses are being exposed. Since many of the arguments I’ve come across tend to recur, I thought I’d summarize the most common ones here–along with the rebuttals I usually offer for why, with one possible exception, the arguments for giving a pass to sloppy-but-common methodological practices are not very compelling.

“But everyone does it, so how bad can it be?”

We typically assume that long-standing conventions must exist for some good reason, so when someone raises doubts about some widespread practice, it’s quite natural to question the person raising the doubts rather than the practice itself. Could it really, truly be (we say) that there’s something deeply strange and misguided about using p values? Is it really possible that the reporting practices converged on by thousands of researchers in tens of thousands of neuroimaging articles might leave something to be desired? Could failing to correct for the many researcher degrees of freedom associated with most datasets really inflate the false positive rate so dramatically?

The answer to all these questions, of course, is yes–or at least, we should allow that it could be yes. It is, in principle, entirely possible for an entire scientific field to regularly do things in a way that isn’t very good. There are domains where appeals to convention or consensus make perfect sense, because there are few good reasons to do things a certain way except inasmuch as other people do them the same way. If everyone else in your country drives on the right side of the road, you may want to consider driving on the right side of the road too. But science is not one of those domains. In science, there is no intrinsic benefit to doing things just for the sake of convention. In fact, almost by definition, major scientific advances are ones that tend to buck convention and suggest things that other researchers may not have considered possible or likely.

In the context of common methodological practice, it’s no defense at all to say but everyone does it this way, because there are usually relatively objective standards by which we can gauge the quality of our methods, and it’s readily apparent that there are many cases where the consensus approach leaves something to be desired. For instance, you can’t really justify failing to correct for multiple comparisons when you report a single test that’s just barely significant at p < .05 on the grounds that nobody else corrects for multiple comparisons in your field. That may be a valid explanation for why your paper successfully got published (i.e., reviewers didn’t want to hold your feet to the fire for something they themselves are guilty of in their own work), but it’s not a valid defense of the actual science. If you run a t-test on randomly generated data 20 times, you will, on average, get a significant result, p < .05, once. It does no one any good to argue that because the convention in a field is to allow multiple testing–or to ignore statistical power, or to report only p values and not effect sizes, or to omit mention of conditions that didn’t ‘work’, and so on–it’s okay to ignore the issue. There’s a perfectly reasonable question as to whether it’s a smart career move to start imposing methodological rigor on your work unilaterally (see below), but there’s no question that the mere presence of consensus or convention surrounding a methodological practice does not make that practice okay from a scientific standpoint.
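To make the 20-tests arithmetic concrete, here is a rough simulation sketch. Several things in it are my own assumptions for illustration rather than anything from the studies discussed here: it swaps the t-test for a two-sample z-test with known unit variance (so it runs on the Python standard library alone), and the group size, number of simulated studies, seed, and function names are all made up. Under those assumptions, 20 independent null tests at p < .05 should yield one false positive on average, and roughly a 64% chance of at least one.

```python
import random
from statistics import NormalDist, mean

ALPHA = 0.05   # conventional significance threshold
N_TESTS = 20   # number of tests run on pure noise


def z_test_p_value(sample_a, sample_b):
    """Two-sided p-value for a difference in means.

    Assumes both groups have known unit variance (a z-test stand-in
    for the t-test, chosen only to avoid third-party dependencies).
    """
    standard_error = (1 / len(sample_a) + 1 / len(sample_b)) ** 0.5
    z = (mean(sample_a) - mean(sample_b)) / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))


def count_false_positives(rng, n_tests=N_TESTS, n_per_group=20):
    """Run n_tests comparisons on random data with no true effect."""
    hits = 0
    for _ in range(n_tests):
        a = [rng.gauss(0, 1) for _ in range(n_per_group)]
        b = [rng.gauss(0, 1) for _ in range(n_per_group)]
        if z_test_p_value(a, b) < ALPHA:
            hits += 1
    return hits


def simulate(n_studies=2000, seed=0):
    """Return (mean false positives per study, share of studies with any)."""
    rng = random.Random(seed)
    counts = [count_false_positives(rng) for _ in range(n_studies)]
    return mean(counts), mean(1 if c > 0 else 0 for c in counts)


# Closed-form values, assuming the 20 tests are independent.
expected_hits = N_TESTS * ALPHA                  # one false positive on average
prob_at_least_one = 1 - (1 - ALPHA) ** N_TESTS   # chance of at least one
bonferroni_alpha = ALPHA / N_TESTS               # per-test threshold after correction

if __name__ == "__main__":
    avg_hits, share_with_any = simulate()
    print(f"expected false positives per study: {expected_hits:.2f}")
    print(f"simulated false positives per study: {avg_hits:.2f}")
    print(f"P(at least one false positive), analytic: {prob_at_least_one:.3f}")
    print(f"P(at least one false positive), simulated: {share_with_any:.3f}")
    print(f"Bonferroni-corrected per-test alpha: {bonferroni_alpha:.4f}")
```

The Bonferroni line is there only as the simplest correction to point at; it is conservative, and nothing here depends on it being the right correction for any particular study.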

“But psychology would break if we could only report results that were truly predicted a priori!”

This is a defense that has some plausibility at first blush. It’s certainly true that if you force researchers to correct for multiple comparisons properly, and report the many analyses they actually conducted–and not just those that “worked”–a lot of stuff that used to get through the filter will now get caught in the net. So, by definition, it would be harder to detect unexpected effects in one’s data–even when those unexpected effects are, in some sense, ‘real’. But the important thing to keep in mind is that raising the bar for what constitutes a believable finding doesn’t actually prevent researchers from discovering unexpected new effects; all it means is that it becomes harder to report post-hoc results as pre-hoc results. It’s not at all clear why forcing researchers to put in more effort validating their own unexpected finding is a bad thing.

In fact, forcing researchers to go the extra mile in this way would have one exceedingly important benefit for the field as a whole: it would shift the onus of determining whether an unexpected result is plausible enough to warrant pursuing away from the community as a whole, and towards the individual researcher who discovered the result in the first place. As it stands right now, if I discover an unexpected result (p < .05!) that I can make up a compelling story for, there’s a reasonable chance I might be able to get that single result into a short paper in, say, Psychological Science. And reap all the benefits that attend getting a paper into a “high-impact” journal. So in practice there’s very little penalty to publishing questionable results, even if I myself am not entirely (or even mostly) convinced that those results are reliable. This state of affairs is, to put it mildly, not A Good Thing.

In contrast, if you as an editor or reviewer start insisting that I run another study that directly tests and replicates my unexpected finding before you’re willing to publish my result, I now actually have something at stake. Because it takes time and money to run new studies, I’m probably not going to bother to follow up on my unexpected finding unless I really believe it. Which is exactly as it should be: I’m the guy who discovered the effect, and I know about all the corners I have or haven’t cut in order to produce it; so if anyone should make the decision about whether to spend more taxpayer money chasing the result, it should be me. You, as the reviewer, are not in a great position to know how plausible the effect truly is, because you have no idea how many different types of analyses I attempted before I got something to ‘work’, or how many failed studies I ran that I didn’t tell you about. Given the huge asymmetry in information, it seems perfectly reasonable for reviewers to say, You think you have a really cool and unexpected effect that you found a compelling story for? Great; go and directly replicate it yourself and then we’ll talk.

“But mistakes happen, and people could get falsely accused!”

Some people don’t like the idea of a guy like Simonsohn running around and busting people’s data fabrication operations for the simple reason that they worry that the kind of approach Simonsohn used to detect fraud is just not that well-tested, and that if we’re not careful, innocent people could get swept up in the net. I think this concern stems from fundamentally good intentions, but once again, I think it’s also misguided.

For one thing, it’s important to note that, despite all the press, Simonsohn hasn’t actually done anything qualitatively different from what other whistleblowers or skeptics have done in the past. He may have suggested new techniques that improve the efficiency with which cheating can be detected, but it’s not as though he invented the ability to report or investigate other researchers for suspected misconduct. Researchers suspicious of other researchers’ findings have always used qualitatively similar arguments to raise concerns. They’ve said things like, hey, look, this is a pattern of data that just couldn’t arise by chance, or, the numbers are too similar across different conditions.

More to the point, perhaps, no one is seriously suggesting that independent observers shouldn’t be allowed to raise their concerns about possible misconduct with journal editors, professional organizations, and universities. There really isn’t any viable alternative. Naysayers who worry that innocent people might end up ensnared by false accusations presumably aren’t suggesting that we do away with all of the existing mechanisms for ensuring accountability; but since the role of people like Simonsohn is only to raise suspicion and provide evidence (and not to do the actual investigating or firing), it’s clear that there’s no way to regulate this type of behavior even if we wanted to (which I would argue we don’t). If I wanted to spend the rest of my life scanning the statistical minutiae of psychology articles for evidence of misconduct and reporting it to the appropriate authorities (and I can assure you that I most certainly don’t), there would be nothing anyone could do to stop me, nor should there be. Remember that accusing someone of misconduct is something anyone can do, but establishing that misconduct has actually occurred is a serious task that requires careful internal investigation. No one–certainly not Simonsohn–is suggesting that a routine statistical test should be all it takes to end someone’s career. In fact, Simonsohn himself has noted that he identified a 4th case of likely fraud that he dutifully reported to the appropriate authorities only to be met with complete silence. Given all the incentives universities and journals have to look the other way when accusations of fraud are made, I suspect we should be much more concerned about the false negative rate than the false positive rate when it comes to fraud.

“But it hurts the public’s perception of our field!”

Sometimes people argue that even if the field does have some serious methodological problems, we still shouldn’t discuss them publicly, because doing so is likely to instill a somewhat negative view of psychological research in the public at large. The unspoken implication being that, if the public starts to lose confidence in psychology, fewer students will enroll in psychology courses, fewer faculty positions will be created to teach students, and grant funding to psychologists will decrease. So, by airing our dirty laundry in public, we’re only hurting ourselves. I had an email exchange with a well-known researcher to exactly this effect a few years back in the aftermath of the Vul et al “voodoo correlations” paper–a paper I commented on to the effect that the problem was even worse than suggested. The argument my correspondent raised was, in effect, that we (i.e., neuroimaging researchers) are all at the mercy of agencies like NIH to keep us employed, and if it starts to look like we’re clowning around, the unemployment rate for people with PhDs in cognitive neuroscience might start to rise precipitously.

While I obviously wouldn’t want anyone to lose their job or their funding solely because of a change in public perception, I can’t say I’m very sympathetic to this kind of argument. The problem is that it places short-term preservation of the status quo above both the long-term health of the field and the public’s interest. For one thing, I think you have to be quite optimistic to believe that some of the questionable methodological practices that are relatively widespread in psychology (data snooping, selective reporting, etc.) are going to sort themselves out naturally if we just look the other way and let nature run its course. The obvious reason for skepticism in this regard is that many of the same criticisms have been around for decades, and it’s not clear that anything much has improved. Maybe the best example of this is Sedlmeier and Gigerenzer’s 1989 paper entitled “Do studies of statistical power have an effect on the power of studies?”, in which the authors convincingly showed that despite three decades of work by luminaries like Jacob Cohen advocating power analyses, statistical power had not risen appreciably in psychology studies. The presence of such unwelcome demonstrations suggests that sweeping our problems under the rug in the hopes that someone (the mice?) will unobtrusively take care of them for us is wishful thinking.

In any case, even if problems did tend to solve themselves when hidden away from the prying eyes of the media and public, the bigger problem with what we might call the “saving face” defense is that it is, fundamentally, an abuse of taxpayers’ trust. As with so many other things, Richard Feynman summed up the issue eloquently in his famous Cargo Cult science commencement speech:

For example, I was a little surprised when I was talking to a friend who was going to go on the radio. He does work on cosmology and astronomy, and he wondered how he would explain what the applications of this work were. “Well,” I said, “there aren’t any.” He said, “Yes, but then we won’t get support for more research of this kind.” I think that’s kind of dishonest. If you’re representing yourself as a scientist, then you should explain to the layman what you’re doing–and if they don’t want to support you under those circumstances, then that’s their decision.

The fact of the matter is that our livelihoods as researchers depend directly on the goodwill of the public. And the taxpayers are not funding our research so that we can “discover” interesting-sounding but ultimately unreplicable effects. They’re funding our research so that we can learn more about the human mind and hopefully be able to fix it when it breaks. If a large part of the profession is routinely employing practices that are at odds with those goals, it’s not clear why taxpayers should be footing the bill. From this perspective, it might actually be a good thing for the field to revise its standards, even if (in the worst-case scenario) that causes a short-term contraction in employment.

“But unreliable effects will just fail to replicate, so what’s the big deal?”

This is a surprisingly common defense of sloppy methodology, maybe the single most common one. It’s also an enormous cop-out, since it pre-empts the need to think seriously about what you’re doing in the short term. The idea is that, since no single study is definitive, and a consensus about the reality or magnitude of most effects usually doesn’t develop until many studies have been conducted, it’s reasonable to impose a fairly low bar on initial reports and then wait and see what happens in subsequent replication efforts.

I think this is a nice ideal, but things just don’t seem to work out that way in practice. For one thing, there doesn’t seem to be much of a penalty for publishing high-profile results that later fail to replicate. The reason, I suspect, is that we incline to give researchers the benefit of the doubt: surely (we say to ourselves), Jane Doe did her best, and we like Jane, so why should we question the work she produces? If we’re really so skeptical about her findings, shouldn’t we go replicate them ourselves, or wait for someone else to do it?

While this seems like an agreeable and fair-minded attitude, it isn’t actually a terribly good way to look at things. Granted, if you really did put in your best effort–dotted all your i’s and crossed all your t’s–and still ended up reporting a false result, we shouldn’t punish you for it. I don’t think anyone is seriously suggesting that researchers who inadvertently publish false findings should be ostracized or shunned. On the other hand, it’s not clear why we should continue to celebrate scientists who ‘discover’ interesting effects that later turn out not to replicate. If someone builds a career on the discovery of one or more seemingly important findings, and those findings later turn out to be wrong, the appropriate attitude is to update our beliefs about the merit of that person’s work. As it stands, we rarely seem to do this.

In any case, the bigger problem with appeals to replication is that the delay between initial publication of an exciting finding and subsequent consensus disconfirmation can be very long, and often spans entire careers. Waiting decades for history to prove an influential idea wrong is a very bad idea if the available alternative is to nip the idea in the bud by requiring stronger evidence up front.

There are many notable examples of this in the literature. A well-publicized recent one is John Bargh’s work on the motor effects of priming people with elderly stereotypes–namely, that priming people with words related to old age makes them walk away from the experiment more slowly. Bargh’s original paper was published in 1996, and according to Google Scholar, has now been cited over 2,000 times. It has undoubtedly been hugely influential in directing many psychologists’ research programs in certain directions (in many cases, in directions that are equally counterintuitive and also now seem open to question). And yet it’s taken over 15 years for a consensus to develop that the original effect is at the very least much smaller in magnitude than originally reported, and potentially so small as to be, for all intents and purposes, “not real”. I don’t know who reviewed Bargh’s paper back in 1996, but I suspect that if they ever considered the seemingly implausible size of the effect being reported, they might well have thought to themselves, well, I’m not sure I believe it, but that’s okay–time will tell. Time did tell, of course; but time is kind of lazy, so it took fifteen years for it to tell. In an alternate universe, a reviewer might have said, well, this is a striking finding, but the effect seems implausibly large; I would like you to try to directly replicate it in your lab with a much larger sample first. I recognize that this is onerous and annoying, but my primary responsibility is to ensure that only reliable findings get into the literature, and inconveniencing you seems like a small price to pay. Plus, if the effect is really what you say it is, people will be all the more likely to believe you later on.

Or take the actor-observer asymmetry, which appears in just about every introductory psychology textbook written in the last 20 – 30 years. It states that people are relatively more likely to attribute their own behavior to situational factors, and relatively more likely to attribute other agents’ behaviors to those agents’ dispositions. When I slip and fall, it’s because the floor was wet; when you slip and fall, it’s because you’re dumb and clumsy. This putative asymmetry was introduced and discussed at length in a book by Jones and Nisbett in 1971, and hundreds of studies have investigated it at this point. And yet a 2006 meta-analysis by Malle suggested that the cumulative evidence for the actor-observer asymmetry is actually very weak. There are some specific circumstances under which you might see something like the postulated effect, but what is quite clear is that it’s nowhere near strong enough an effect to justify being routinely invoked by psychologists and even laypeople to explain individual episodes of behavior. Unfortunately, at this point it’s almost impossible to dislodge the actor-observer asymmetry from the psyche of most researchers–a reality underscored by the fact that the Jones and Nisbett book has been cited nearly 3,000 times, whereas the 2006 meta-analysis has been cited only 96 times (a very low rate for an important and well-executed meta-analysis published in Psychological Bulletin).

The fact that it can take many years–whether 15 or 45–for a literature to build up to the point where we’re even in a position to suggest with any confidence that an initially exciting finding could be wrong means that we should be very hesitant to appeal to long-term replication as an arbiter of truth. Replication may be the gold standard in the very long term, but in the short and medium term, appealing to replication is a huge cop-out. If you can see problems with an analysis right now that cast aspersions on a study’s results, it’s an abdication of responsibility to downplay your concerns and wait for someone else to come along and spend a lot more time and money trying to replicate the study. You should point out now why you have concerns. If the authors can address them, the results will look all the better for it. And if the authors can’t address your concerns, well, then, you’ve just done science a service. If it helps, don’t think of it as a matter of saying mean things about someone else’s work, or of asserting your own ego; think of it as potentially preventing a lot of very smart people from wasting a lot of time chasing down garden paths–and also saving a lot of taxpayer money. Remember that our job as scientists is not to make other scientists’ lives easy in the hopes they’ll repay the favor when we submit our own papers; it’s to establish and apply standards that produce convergence on the truth in the shortest amount of time possible.

“But it would hurt my career to be meticulously honest about everything I do!”

Unlike the other considerations listed above, I think the concern that being honest carries a price when it comes to doing research has a good deal of merit to it. Given the aforementioned delay between initial publication and later disconfirmation of findings (which even in the best case is usually longer than the delay between obtaining a tenure-track position and coming up for tenure), researchers have many incentives to emphasize expediency and good story-telling over accuracy, and it would be disingenuous to suggest otherwise. No malevolence or outright fraud is implied here, mind you; the point is just that if you keep second-guessing and double-checking your analyses, or insist on routinely collecting more data than other researchers might think is necessary, you will very often find that results that could have made a bit of a splash given less rigor are actually not particularly interesting upon careful cross-examination. Which means that researchers who have, shall we say, less of a natural inclination to second-guess, double-check, and cross-examine their own work will, to some degree, be more likely to publish results that make a bit of a splash (it would be nice to believe that pre-publication peer review filters out sloppy work, but empirically, it just ain’t so). So this is a classic tragedy of the commons: what’s good for a given individual, career-wise, is clearly bad for the community as a whole.

I wish I had a good solution to this problem, but I don’t think there are any quick fixes. The long-term solution, as many people have observed, is to restructure the incentives governing scientific research in such a way that individual and communal benefits are directly aligned. Unfortunately, that’s easier said than done. I’ve written a lot both in papers (1, 2, 3) and on this blog (see posts linked here) about various ways we might achieve this kind of realignment, but what’s clear is that it will be a long and difficult process. For the foreseeable future, it will continue to be an understandable though highly lamentable defense to say that the cost of maintaining a career in science is that one sometimes has to play the game the same way everyone else plays the game, even if it’s clear that the rules everyone plays by are detrimental to the communal good.

 

Anyway, this may all sound a bit depressing, but I really don’t think it should be taken as such. Personally I’m actually very optimistic about the prospects for large-scale changes in the way we produce and evaluate science within the next few years. I do think we’re going to collectively figure out how to do science in a way that directly rewards people for employing research practices that are maximally beneficial to the scientific community as a whole. But I also think that for this kind of change to take place, we first need to accept that many of the defenses we routinely give for using iffy methodological practices are just not all that compelling.

on writing: some anecdotal observations, in no particular order

  • Early on in graduate school, I invested in the book “How to Write a Lot”. I enjoyed reading it–mostly because I (mistakenly) enjoyed thinking to myself, “hey, I bet as soon as I finish this book, I’m going to start being super productive!” But I can save you the $9 and tell you there’s really only one take-home point: schedule writing like any other activity, and stick to your schedule no matter what. Though, having said that, I don’t really do that myself. I find I tend to write about 20 hours a week on average. On a very good day, I manage to get a couple of thousand words written, but much more often, I get 200 words written that I then proceed to rewrite furiously and finally trash in frustration. But it all adds up in the long run I guess.
  • Some people are good at writing one thing at a time; they can sit down for a week and crank out a solid draft of a paper without ever looking sideways at another project. Personally, unless I have a looming deadline (and I mean a real deadline–more on that below), I find that impossible to do; my general tendency is to work on one writing project for an hour or two, and then switch to something else. Otherwise I pretty much lose my mind. I also find it helps to reward myself–i.e., I’ll work on something I really don’t want to do for an hour, and then play video games for a while, er, switch to writing something more pleasant.
  • I can rarely get any ‘real’ writing (i.e., stuff that leads to publications) done after around 6 pm; late mornings (i.e., right after I wake up) are usually my most productive writing time. And I generally only write for fun (blogging, writing fiction, etc.) after 9 pm. There are exceptions, but by and large that’s my system.
  • I don’t write many drafts. I don’t mean that I never revise papers, because I do–obsessively. But I don’t sit down thinking “I’m going to write a very rough draft, and then I’ll go back and clean up the language.” I sit down thinking “I’m going to write a perfect paper the first time around,” and then I very slowly crank out a draft that’s remarkably far from being perfect. I suspect the former approach is actually the more efficient one, but I can’t bring myself to do it. I hate seeing malformed sentences on the page, even if I know I’m only going to delete them later. It always amazes and impresses me when I get Word documents from collaborators with titles like “AmazingNatureSubmissionVersion18”. I just give all my documents the title “paper_draft”. There might be a V2 or a V3, but there will never, ever be a V18.
  • Papers are not meant to be written linearly. I don’t know anyone who starts with the Introduction, then does the Methods and Results, and then finishes with the Discussion. Personally I don’t even write papers one section at a time. I usually start out by frantically writing down ideas as they pop into my head, and jumping around the document as I think of other things I want to say. I frequently write half a sentence down and then finish it with a bunch of question marks (like so: ???) to indicate I need to come back later and patch it up. Incidentally, this is also why I’m terrified to ever show anyone any of my unfinished paper drafts: an unsuspecting reader would surely come away thinking I suffer from a serious thought disorder. (I suppose they might be right.)
  • Okay, that last point is not entirely true. I don’t write papers completely haphazardly; I do tend to write Methods and Results before Intro and Discussion. I gather that this is a pretty common approach. On the rare occasions when I’ve started writing the Introduction first, I’ve invariably ended up having to completely rewrite it, because it usually turns out the results aren’t actually what I thought they were.
  • My sense is that most academics get more comfortable writing as time goes on. Relatively few grad students have the perseverance to rapidly crank out publication-worthy papers from day 1 (I was definitely not one of them). I don’t think this is just a matter of practice; I suspect part of it is a natural maturation process. People generally get more conscientious as they age; it stands to reason that writing (as an activity most people find unpleasant) should get easier too. I’m better at motivating myself to write papers now, but I’m also much better about doing the dishes and laundry–and I’m pretty sure that’s not because practice makes dishwashing perfect.
  • When I started grad school, I was pretty sure I’d never publish anything, let alone graduate, because I’d never handed in a paper as an undergraduate that wasn’t written at the last minute, whereas in academia, there are virtually no hard deadlines (see below). I’m not sure exactly what changed. I’m still continually surprised every time something I wrote gets published. And I often catch myself telling myself, “hey, self, how the hell did you ever manage to pay attention long enough to write 5,000 words?” And then I reply to myself, “well, self, since you ask, I took a lot of stimulants.”
  • I pace around a lot when I write. A lot. To the point where my labmates–who are all uncommonly nice people–start shooting death glares my way. It’s a heritable tendency, I guess (the pacing, not the death glare attraction); my father also used to pace obsessively. I’m not sure what the biological explanation for it is. My best guess is it’s an arousal-mediated effect: I can think pretty well when I’m around other people, or when I’m in motion, but if I’m sitting at a desk and I don’t already know exactly what I want to say, I can’t get anything done. I generally pace around the lab or house for a while figuring out what I want to say, and then I sit down and write until I’ve forgotten what I want to say, or decide I didn’t really want to say that after all. In practice this usually works out to 10 minutes of pacing for every 5 minutes of writing. I envy people who can just sit down and calmly write for two or three hours without interruption (though I don’t think there are that many of them). At the same time, I’m pretty sure I burn a lot of calories this way.
  • I’ve been pleasantly surprised to discover that I much prefer writing grant proposals to writing papers–to the point where I actually enjoy writing grant proposals. I suspect the main reason for this is that grant proposals have a kind of openness that papers don’t; with a paper, you’re constrained to telling the story the data actually support, whereas a grant proposal is as good as your vision of what’s possible (okay, and plausible). A second part of it is probably the novelty of discovery: once you conduct your analyses, all that’s left is to tell other people what you found, which (to me) isn’t so exciting. I mean, I already think I know what’s going on; what do I care if you know? Whereas when writing a grant, a big part of the appeal for me is that I could actually go out and discover new stuff–just as long as I can convince someone to give me some money first.
  • At a departmental seminar attended by about 30 people, I once heard a student express concern about an in-progress review article that he and several of the other people at the seminar were collaboratively working on. The concern was that if all of the collaborators couldn’t agree on what was going to go in the paper (and they didn’t seem to be able to at that point), the paper wouldn’t get written in time to make the rapidly approaching deadline dictated by the journal editor. A senior and very brilliant professor responded to the student’s concern by pointing out that this couldn’t possibly be a real problem seeing as in reality there is actually no such thing as a hard writing deadline. This observation didn’t go over so well with some of the other senior professors, who weren’t thrilled that their students were being handed the key to the kingdom of academic procrastination so early in their careers. But it was true, of course: with the major exception of grant proposals (EDIT: and as Garrett points out in the comments below, conference publications in disciplines like Computer Science), most of the things academics write (journal articles, reviews, commentaries, book chapters, etc.) operate on a very flexible schedule. Usually when someone asks you to write something for them, there is some vague mention somewhere of some theoretical deadline, which is typically a date that seems so amazingly far off into the future that you wonder if you’ll even be the same person when it rolls around. And then, much to your surprise, the deadline rolls around and you realize that you must in fact really be a different person, because you don’t seem to have any real desire to work on this thing you signed up for, and instead of writing it, why don’t you just ask the editor for an extension while you go rustle up some motivation. So you send a polite email, and the editor grudgingly says, “well, hmm, okay, you can have another two weeks,” to which you smile and nod sagely, and then, two weeks later, you send another similarly worded but even more obsequious email that starts with the words “so, about that extension…”

    The basic point here is that there’s an interesting dilemma: even though there rarely are any strict writing deadlines, it’s to almost everyone’s benefit to pretend they exist. If I ever find out that the true deadline (insofar as such a thing exists) for the chapter I’m working on right now is 6 months from now and not 3 months ago (which is what they told me), I’ll probably relax and stop working on it for, say, the next 5 and a half months. I sometimes think that the most productive academics are the ones who are just really really good at repeatedly lying to themselves.

  • I’m a big believer in structured procrastination when it comes to writing. I try to always have a really unpleasant but not-so-important task in the background, which then forces me to work on only-slightly-unpleasant-but-often-more-important tasks. Except it often turns out that the unpleasant-but-not-so-important task is actually an unpleasant-but-really-important task after all, and then I wake up in a cold sweat in the middle of the night thinking of all the ways I’ve screwed myself over. No, just kidding. I just bitch about it to my wife for a while and then drown my sorrows in an extra helping of ice cream.
  • I’m really, really bad at restarting projects I’ve put on the back burner for a while. Right now there are 3 or 4 papers I’ve been working on on-and-off for 3 or 4 years, and every time I pick them up, I write a couple of hundred words and then put them away for a couple of months. I guess what I’m saying is that if you ever have the misfortune of collaborating on a paper with me, you should make sure to nag me several times a week until I get so fed up with you I sit down and write the damn paper. Otherwise it may never see the light of day.
  • I like writing fiction in my spare time. I also occasionally write whiny songs. I’m pretty terrible at both of these things, but I enjoy them, and I’m told (though I don’t believe it for a second) that that’s the important thing.

in praise of self-policing

It’s IRB week over at The Hardest Science; Sanjay has an excellent series of posts (1, 2, 3) discussing some proposed federal rule changes to the way IRBs oversee research. The short of it is that the proposed changes are mostly good news for people who do minimal risk-type research with human subjects (i.e., stuff that doesn’t involve poking people with needles); if the changes pass as written, most of us will no longer have to file any documents with our IRBs before running our studies. We’ll just put in a short note saying we’ve determined that our studies are excused from review, and then we can start collecting data right away. It’ll work something like this*:

This doesn’t mean federal oversight of human subjects research will cease, of course. There will still be guidelines we all have to follow. But instead of making researchers jump through flaming hoops preemptively, enforcement will take place on an ad-hoc basis and via random audits. For the most part, the important decisions will be left to investigators rather than IRBs. For more details, see Sanjay’s excellent breakdown.

I also agree with Sanjay’s sentiment in his latest post that this is the right way to do things; researchers should police themselves, rather than employing an entire staff of people whose job it is to tell researchers how to safely and ethically do their research. In principle, the idea of having trained IRB analysts go over every study sounds nice; the problem is that it takes a very long time, generates a lot of extra work for everyone, and perhaps most problematically, sets up all sorts of perverse incentives. Namely, IRB analysts have an incentive to be pedantic (since they rarely lose their jobs if they ask for too much detail, but could be liable if they give too much leeway and something bad happens), and investigators have an incentive to off-load their conscience onto the IRB rather than actually having to think about the impact of their experiment on subjects. I catch myself doing this more often than I’d like, and I’m not really happy about it. (For instance, I recently found myself telling someone it was okay for them to present gruesome pictures to subjects “because the IRB doesn’t mind that”, and not because I thought the psychological impact was negligible. I gave myself twenty lashes for that one**.) I suspect that, aside from saving everyone a good deal of time and effort, placing the responsibility of doing research on researchers’ shoulders would actually lead them to give more, and not less, consideration to ethical issues.

Anyway, it remains to be seen whether the proposed rules actually pass in their current form. One of the interesting features of the situation is that IRBs may now perversely actually have an incentive to fight against these rules going into effect, since they’d almost certainly need to lay off staff if we move to a system where most studies are entirely excused from review. I don’t really think that this will be much of an issue, and on balance I’m sure university administrations recognize how much IRBs slow down research; but it still can’t hurt for those of us who do research with human subjects to stick our heads past the Department of Health and Human Services’ doors and affirm that excusing most non-invasive human subjects research from review is the right thing to do.


* I know, I know. I managed to go two whole years on this blog without a single lolcat appearance, and now I throw it all away for this. Sorry.

** With a feather duster.

CNS 2011: a first-person shorthand account in the manner of Rocky Steps

Friday, April 1

4 pm. Arrive at SFO International on bumpy flight from Denver.

4:45 pm. Approach well-dressed man downtown and open mouth to ask for directions to Hyatt Regency San Francisco. “Sorry,” says well-dressed man, “No change to give.” Back off slowly, swinging bags, beard, and poster tube wildly, mumbling “I’m not a panhandler, I’m a neuroscientist.” Realize that difference between the two may be smaller than initially suspected.

6:30 pm. Hear loud knocking on hotel room door. Open door to find roommate. Say hello to roommate. Realize roommate is extremely drunk from East Coast flight. Offer roommate bag of coffee and orange tic-tacs. Roommate is confused, asks, “are you drunk?” Ignore roommate’s question. “You’re drunk, aren’t you.” Deny roommate’s unsubstantiated accusations. “When you write about this on your blog, you better not try to make it look like I’m the drunk one,” roommate says. Resolve to ignore roommate’s crazy talk for next 4 days.

6:45 pm. Attempt to open window of 10th floor hotel room in order to procure fresh air for face. Window refuses to open. Commence nudging of, screaming at, and bargaining with window. Window still refuses to open. Roommate points out sticker saying window does not open. Ignore sticker, continue berating window. Window still refuses to open, but now has low self-esteem.

8 pm. Have romantic candlelight dinner at expensive French restaurant with roommate. Make jokes all evening about ideal location (San Francisco) for start of new intimate relationship. Suspect roommate is uncomfortable, but persist in faux wooing. Roommate finally turns tables by offering to put out. Experience heightened level of discomfort, but still finish all of steak tartare and order creme brulee. Dessert appetite is immune to off-color humor!

11 pm – 1 am. Grand tour of seedy SF bars with roommate and old grad school friend. New nightlife low: denied entrance to seedy dance club because shoes insufficiently classy. Stupid Teva sandals.

Saturday, April 2

9:30 am. Wake up late. Contemplate running downstairs to check out ongoing special symposium for famous person who does important research. Decide against. Contemplate visiting hotel gym to work off creme brulee from last night. Decide against. Contemplate reading conference program in bed and circling interesting posters to attend. Decide against. Contemplate going back to sleep. Consult with self, make unanimous decision in favor.

1 pm. Have extended lunch meeting with collaborators at Ferry Building to discuss incipient top-secret research project involving diesel generator, overstock beanie babies, and apple core. Already giving away too much!

3:30 pm. Return to hotel. Discover hotel is now swarming with name badges attached to vaguely familiar faces. Hug vaguely familiar faces. Hugs are met with startled cries. Realize that vaguely familiar faces are actually completely unfamiliar faces. Wrong conference: Young Republicans, not Cognitive Neuroscientists. Make beeline for elevator bank, pursued by angry middle-aged men dressed in American flags.

5 pm. Poster session A! The sights! The sounds! The lone free drink at the reception! The wonders of yellow 8-point text on black 6′ x 4′ background! Too hard to pick a favorite thing, not even going to try. Okay, fine: free schwag at the exhibitor stands.

5 pm – 7 pm. Chat with old friends. Have good time catching up. Only non-fictionalized bullet point of entire piece.

8 pm. Dinner at belly dancing restaurant in lower Haight. Great conversation, good food, mediocre dancing. Towards end of night, insist on demonstrating own prowess in fine art of torso shaking; climb on table and gyrate body wildly, alternately singing Oompa-Loompa song and yelling “get in my belly!” at other restaurant patrons. Nobody tips.

12:30 am. Take the last train to Clarksville. Take last N train back to Hyatt Regency hotel.

Sunday, April 3

7 am. Wake up with amazing lack of hangover. Celebrate amazing lack of hangover by running repeated victory laps around 10th floor of Hyatt Regency, Rocky Steps style. Quickly realize initial estimate of hangover absence off by order of magnitude. Revise estimate; collapse in puddle on hotel room floor. Refuse to move until first morning session.

8:15 am. Wander the eight Caltech aisles of morning poster session in search of breakfast. Fascinating stuff, but this early in morning, only value signals of interest are smell and sight of coffee, muffins, and bagels.

10 am. Terrific symposium includes excellent talks about emotion, brain-body communication, and motivation, but favorite moment is still when friend arrives carrying bucket of aspirin.

1 pm. Bump into old grad school friend outside; decide to grab lunch on pier behind Ferry Building. Discuss anterograde amnesia and dating habits of mutual friends. Chicken and tofu cake is delicious. Sun is out, temperature is mild; perfect day to not attend poster sessions.

1:15 – 2 pm. Attend poster session.

2 pm – 5 pm. Presenting poster in 3 hours! Have full-blown panic attack in hotel room. Not about poster, about General Hospital. Why won’t Lulu take Dante’s advice and call support group number for alcoholics’ families?!?! Alcohol is Luke’s problem, Lulu! Call that number!

5 pm. Present world’s most amazing poster to three people. Launch into well-rehearsed speech about importance of work and great glory of sophisticated technical methodology before realizing two out of three people are mistakenly there for coffee and cake, and third person mistook presenter for someone famous. Pause to allow audience to mumble excuses and run to coffee bar. When coast is clear, resume glaring at anyone who dares to traverse poster aisle. Believe strongly in marking one’s territory.

8 pm. Lab dinner at House of Nanking. Food is excellent, despite unreasonably low tablespace-to-floorspace ratio. Conversation revolves around fainting goats, ‘relaxation’ in Thailand, and, occasionally, science.

10 pm. Karaoke at The Mint. Compare performance of CNS attendees with control group of regulars; establish presence of robust negative correlation between years of education and singing ability. Completely wreck voice performing whitest rendition ever of Shaggy’s “Oh Carolina”. Crowd jeers. No, wait, crowd gyrates. In wholesome scientific manner. Crowd is composed entirely of people with low self-monitoring skills; what luck! DJ grimaces through entire song and most of previous and subsequent songs.

2 am. Take cab back to hotel with graduate students and Memory Professor. Memory Professor is drunk; manages to nearly fall out of cab while cab in motion. In-cab conversation revolves around merits of dynamic programming languages. No consensus reached, but civility maintained. Arrival at hotel: all cab inhabitants below professorial rank immediately slip out of cab and head for elevators, leaving Memory Professor to settle bill. In elevator, Graduate Student A suggests that attempt to push Memory Professor out of moving cab was bad idea in view of Graduate Student A’s impending post-doc with Memory Professor. Acknowledge probable wisdom of Graduate Student A’s observation while simultaneously resolving to not adjust own degenerate behavior in the slightest.

2:15 am. Drink at least 24 ounces of water before attaining horizontal position. Fall asleep humming bars of Elliott Smith’s Angeles. Wrong city, but close enough.

Monday, April 4

8 am. Wake up hangover free again! For real this time. No Rocky Steps dance. Shower and brush teeth. Delicately stroke roommate’s cheek (he’ll never know) before heading downstairs for poster session.

8:30 am. Bagels, muffin, coffee. Not necessarily in that order.

9 am – 12 pm. Skip sessions, spend morning in hotel room working. While trying to write next section of grant proposal, experience strange sensation of time looping back on itself, like a snake eating its own tail, but also eating grant proposal at same time. Awake from unexpected nap with ‘Innovation’ section in mouth.

12:30 pm. Skip lunch; for some reason, not very hungry.

1 pm. Visit poster with screaming purple title saying “COME HERE FOR FREE CHOCOLATE.” Am impressed with poster title and poster, but disappointed by free chocolate selection: Dove eggs and purple Hershey’s kisses–worst chocolate in the world! Resolve to show annoyance by disrupting presenter’s attempts to maintain conversation with audience. Quickly knocked out by chocolate eggs thrown by presenter.

5 pm. Wake up in hotel room with headache and no recollection of day’s events. Virus or hangover? Unclear. For some reason, hair smells like chocolate.

7:30 pm. Dinner at Ferry Building with Brain Camp friends. Have now visited Ferry Building at least one hundred times in seventy-two hours. Am now compulsively visiting Ferry Building every fifteen minutes just to feel normal.

9:30 pm. Party at Americano Restaurant & Bar for Young Investigator Award winner. Award comes with $500 and strict instructions to be spent on drinks for total strangers. Strange tradition, but no one complains.

11 pm. Bar is crowded with neuroscientists having great time at Young Investigator’s expense.

11:15 pm. Drink budget runs out.

11:17 pm. Neuroscientists mysteriously vanish.

1 am. Stroll through San Francisco streets in search of drink. Three false alarms, but finally arrive at open pub 10 minutes before last call. Have extended debate with friend over whether hotel room can be called ‘home’. Am decidedly in No camp; ‘home’ is for long-standing attachments, not 4-day hotel hobo runs.

2 am. Walk home.

Tuesday, April 5

9:05 am. Show up 5 minutes late for bagels and muffins. All gone! Experience Apocalypse Now moment on inside, but manage not to show it–except for lone tear. Drown sorrows in Tazo Wild Sweet Orange tea. Tea completely fails to live up to name; experience second, smaller, Apocalypse Now moment. Roommate walks over and asks if everything okay, then gently strokes cheek and brushes away lone tear (he knew!!!).

9:10 – 1 pm. Intermittently visit poster and symposium halls. Not sure why. Must be force of habit learning system.

1:30 pm. Lunch with friends at Thai restaurant near Golden Gate Park. Fill belly up with coconut, noodles, and crab. About to get on table to express gratitude with belly dance, but notice that friends have suddenly disappeared.

2 – 5 pm. Roam around Golden Gate Park and Haight-Ashbury. Stop at Whole Foods for friend to use bathroom. Get chased out of Whole Foods for using bathroom without permission. Very exciting; first time feeling alive on entire trip! Continue down Haight. Discuss socks, ice cream addiction (no such thing), and funding situation in Europe. Turns out it sucks there too.

5:15 pm. Take BART to airport with lab members. Watch San Francisco recede behind train. Sink into slightly melancholic state, but recognize change of scenery is for the best: constitution couldn’t handle more Rocky Steps mornings.

7:55 pm. Suddenly rediscover pronouns as airplane peels away from gate.

8 pm PST – 11:20 MST. The flight’s almost completely empty; I get to stretch out across the entire emergency exit aisle. The sun goes down as we cross the Sierra Nevada; the last of the ice in my cup melts into water somewhere between Provo and Grand Junction. As we start our descent into Denver, the lights come out in force, and I find myself preemptively bored at the thought of the long shuttle ride home. For a moment, I wish I was back in my room at the Hyatt at 8 am–about to run Rocky Steps around the hotel, or head down to the poster hall to find someone to chat with over a bagel and coffee. For some reason, I still feel like I didn’t get quite enough time to hang out with all the people I wanted to see, despite barely sleeping in 4 days. But then sanity returns, and the thought quickly passes.

what Paul Meehl might say about graduate school admissions

Sanjay Srivastava has an excellent post up today discussing the common belief among many academics (or at least psychologists) that graduate school admission interviews aren’t very predictive of actual success, and should be assigned little or no weight when making admissions decisions:

The argument usually goes something like this: “All the evidence from personnel selection studies says that interviews don’t predict anything. We are wasting people’s time and money by interviewing grad students, and we are possibly making our decisions worse by substituting bad information for good.”

I have been hearing more or less that same thing for years, starting when I was in grad school myself. In fact, I have heard it often enough that, not being familiar with the literature myself, I accepted what people were saying at face value. But I finally got curious about what the literature actually says, so I looked it up.

I confess that I must have been drinking from the kool-aid spigot, because until I read Sanjay’s post, I’d long believed something very much like this myself, and for much the same reason. I’d never bothered to actually, you know, look at the data myself. Turns out the evidence and the kool-aid are not compatible:

A little Google Scholaring for terms like “employment interviews” and “incremental validity” led me to a bunch of meta-analyses that concluded that in fact interviews can and do provide useful information above and beyond other valid sources of information (like cognitive ability tests, work sample tests, conscientiousness, etc.). One of the most heavily cited is a 1998 Psych Bulletin paper by Schmidt and Hunter (link is a pdf; it’s also discussed in this blog post). Another was this paper by Cortina et al, which makes finer distinctions among different kinds of interviews. The meta-analyses generally seem to agree that (a) interviews correlate with job performance assessments and other criterion measures, (b) interviews aren’t as strong predictors as cognitive ability, (c) but they do provide incremental (non-overlapping) information, and (d) in those meta-analyses that make distinctions between different kinds of interviews, structured interviews are better than unstructured interviews.

This seems entirely reasonable, and I agree with Sanjay that it clearly shows that admissions interviews aren’t useless, at least in an actuarial sense. That said, after thinking about it for a while, I’m not sure these findings really address the central question admissions committees care about. When deciding which candidates to admit as students, the relevant question isn’t really “what factors predict success in graduate school?”; it’s “what factors should the admissions committee attend to when making a decision?” These may seem like the same thing, but they’re not. And the reason they’re not is that knowing which factors are predictive of success is no guarantee that faculty are actually going to be able to use that information in an appropriate way. Knowing what predicts performance is only half the story, as it were; you also need to know exactly how to weight different factors appropriately in order to generate an optimal prediction.

In practice, humans turn out to be incredibly bad at predicting outcomes based on multiple factors. An enormous literature on mechanical (or actuarial) prediction, which Sanjay mentions in his post, has repeatedly demonstrated that in many domains, human judgments are consistently and often substantially outperformed by simple regression equations. There are several reasons for this gap, but one of the biggest ones is that people are just shitty at quantitatively integrating multiple continuous variables. When you visit a car dealership, you may very well be aware that your long-term satisfaction with any purchase is likely to depend on some combination of horsepower, handling, gas mileage, seating comfort, number of cupholders, and so on. But the odds that you’ll actually be able to combine that information in an optimal way are essentially nil. Our brains are simply not designed to work that way; you can’t internally compute the value you’d get out of a car using an equation like 1.03*cupholders + 0.021*horsepower + 0.3*mileage. Some of us try to do it that way–e.g., by making very long pro and con lists detailing all the relevant factors we can possibly think of–but it tends not to work out very well (e.g., you total up the numbers and realize, hey, that’s not the answer I wanted! And then you go buy that antique ’68 Cadillac you had your eye on the whole time you were pretending to count cupholders in the Nissan Maxima).
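For the curious, here is roughly what "doing it the mechanical way" looks like when a computer does the arithmetic instead of your head. This is only a sketch: the weights are lifted straight from the toy equation above (they are not estimated from any data), and the two cars and their feature values are made up for illustration.

```python
# Weights copied from the toy equation in the text; purely illustrative,
# not estimated from real data.
WEIGHTS = {"cupholders": 1.03, "horsepower": 0.021, "mileage": 0.3}


def predicted_value(car):
    """Return the weighted sum of a car's features using the fixed weights."""
    return sum(WEIGHTS[name] * car[name] for name in WEIGHTS)


# Two hypothetical cars with made-up feature values.
sensible_sedan = {"cupholders": 6, "horsepower": 190, "mileage": 30}
antique_caddy = {"cupholders": 0, "horsepower": 375, "mileage": 10}

for label, car in [("sedan", sensible_sedan), ("antique", antique_caddy)]:
    print(label, round(predicted_value(car), 2))
```

The point of the exercise is that the equation applies the same weights every time and never quietly changes them because it has fallen in love with the Cadillac.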

Admissions committees face much the same problem. The trouble lies not so much in determining which factors predict graduate school success (or, for that matter, many other outcomes we care about in daily life), but in determining how to best combine them. Knowing that interview performance incrementally improves predictions is only useful if you can actually trust decision-makers to weight that variable very lightly relative to other more meaningful predictors like GREs and GPAs. And that’s a difficult proposition, because I suspect that admissions discussions rarely go like this:

Faculty Member 1: I think we should accept Candidate X. Her GREs are off the chart, great GPA, already has two publications.
Faculty Member 2: I didn’t like X at all. She didn’t seem very excited to be here.
FM1: Well, that doesn’t matter so much. Unless you really got a strong feeling that she wouldn’t stick it out in the program, it probably won’t make much of a difference, performance-wise.
FM2: Okay, fine, we’ll accept her.

And more often go like this:

FM1: Let’s take Candidate X. Her GREs are off the chart, great GPA, already has two publications.
FM2: I didn’t like X at all. She didn’t seem very excited to be here.
FM1: Oh, you thought so too? That’s kind of how I felt too, but I didn’t want to say anything.
FM2: Okay, we won’t accept X. We have plenty of other good candidates with numbers that are nearly as good and who seemed more pleasant.

Admittedly, I don’t have any direct evidence to back up this conjecture. Except that I think it would be pretty remarkable if academic faculty departed from experts in pretty much every other domain that’s been tested (clinical practice, medical diagnosis, criminal recidivism, etc.) and were actually able to do as well (or even close to as well) as a simple regression equation. For what it’s worth, in many of the studies of mechanical prediction, the human experts are explicitly given all of the information passed to the prediction equation, and still do relatively poorly. In other words, you can hand a clinical psychologist a folder full of quantitative information about a patient, tell them to weight it however they want, and even the best clinicians are still going to be outperformed by a mechanical prediction (if you doubt this to be true, I second Sanjay in directing you to Paul Meehl’s seminal body of work–truly some of the most important and elegant work ever done in psychology, and if you haven’t read it, you’re missing out). And in some sense, faculty members aren’t really even experts about admissions, since they only do it once a year. So I’m pretty skeptical that admissions committees actually manage to weight their firsthand personal experience with candidates appropriately when making their final decisions. It seems much more likely that any personality impressions they come away with will just tend to drown out prior assessments based on (relatively) objective data.
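To make the logic of that last point concrete, here is a toy simulation. It is emphatically not evidence for the conjecture: every number in it is made up for illustration, including the effect sizes, the committee's weights, and the helper names `simulate` and `validity`. It assumes GREs and GPA are fairly strong predictors of success, that the interview adds a small but real increment, and that a hypothetical committee sees exactly the same three numbers as the regression equation but lets its interview impression dominate.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate(n):
    """Simulate standardized applicant data with made-up effect sizes."""
    gre = rng.normal(size=n)
    gpa = rng.normal(size=n)
    interview = rng.normal(size=n)
    # Hypothetical truth: GRE and GPA matter a lot, the interview only a little.
    success = 0.5 * gre + 0.4 * gpa + 0.1 * interview + rng.normal(size=n)
    predictors = np.column_stack([gre, gpa, interview])
    return predictors, success

def validity(prediction, outcome):
    """Correlation between a prediction and the actual outcome."""
    return float(np.corrcoef(prediction, outcome)[0, 1])

# Past cohorts are used to fit the equation; a new cohort is used to test it.
X_train, y_train = simulate(2000)
X_test, y_test = simulate(2000)

# Mechanical prediction: ordinary least squares with an intercept.
design = np.column_stack([np.ones(len(X_train)), X_train])
coef, *_ = np.linalg.lstsq(design, y_train, rcond=None)
mechanical = coef[0] + X_test @ coef[1:]

# Hypothetical committee: same information, but the interview impression
# gets five times the weight of either GRE or GPA (an assumption).
committee_weights = np.array([0.2, 0.2, 1.0])
committee = X_test @ committee_weights

r_mechanical = validity(mechanical, y_test)
r_committee = validity(committee, y_test)
print(f"regression equation: r = {r_mechanical:.2f}")
print(f"interview-heavy committee: r = {r_committee:.2f}")
```

The committee here is handed the same inputs as the equation, which mirrors the studies where experts get all of the information passed to the prediction equation. The only thing that differs is the weighting, and under these assumed numbers that alone is enough to open up a sizable gap.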

That all said, I couldn’t agree more with Sanjay’s ultimate conclusion, so I’ll just end with this quote:

That, of course, is a testable question. So if you are an evidence-based curmudgeon, you should probably want some relevant data. I was not able to find any studies that specifically addressed the importance of rapport and interest-matching as predictors of later performance in a doctoral program. (Indeed, validity studies of graduate admissions are few and far between, and the ones I could find were mostly for medical school and MBA programs, which are very different from research-oriented Ph.D. programs.) It would be worth doing such studies, but not easy.

Oh, except that I do want to add that I really like the phrase “evidence-based curmudgeon”, and I’m totally stealing it.

will trade two Methods sections for twenty-two subjects’ worth of data

The excellent and ever-candid Candid Engineer in Academia has an interesting post discussing the love-hate relationship many scientists who work in wet labs have with benchwork. She compares two very different perspectives:

She [a current student] then went on to say that, despite wanting to go to grad school, she is pretty sure she doesn’t want to continue in academia beyond the Ph.D. because she just loves doing the science so much and she can’t imagine ever not being at the bench.

Being young and into the benchwork, I remember once asking my grad advisor if he missed doing experiments. His response: “Hell no.” I didn’t understand it at the time, but now I do. So I wonder if my student will always feel the way she does now- possessing of that unbridled passion for the pipet, that unquenchable thirst for the cell culture hood.

Wet labs are pretty much nonexistent in psychology–I’ve never had to put on gloves or goggles to do anything that I’d consider an “experiment”, and I’ve certainly never run the risk of spilling dangerous chemicals all over myself–so I have no opinion at all about benchwork. Maybe I’d love it, maybe I’d hate it; I couldn’t tell you. But Candid Engineer’s post did get me thinking about opinions surrounding the psychological equivalent of benchwork–namely, collecting data from human subjects. My sense is that there’s somewhat more consensus among psychologists, in that most of us don’t seem to like data collection very much. But there are plenty of exceptions, and there certainly are strong feelings on both sides.

More generally, I’m perpetually amazed at the wide range of opinions people can hold about the various elements of scientific research, even when the people doing the different-opinion-holding all work in very similar domains. For instance, my favorite aspect of the research I do, hands down, is data analysis. I’d be ecstatic if I could analyze data all day and never have to worry about actually communicating the results to anyone (though I enjoy doing that too). After that, there are activities like writing and software development, which I spend a lot of time doing, and occasionally enjoy, but also frequently find very frustrating. And then, at the other end, there are aspects of research that I find have little redeeming value save for their instrumental value in supporting other, more pleasant, activities–nasty, evil activities like writing IRB proposals and, yes, collecting data.

To me, collecting data is something you do because you’re fundamentally interested in some deep (or maybe not so deep) question about how the mind works, and the only way to get an answer is to actually interrogate people while they do stuff in a controlled environment. It isn’t something I do for fun. Yet I know people who genuinely seem to love collecting data–or, for that matter, writing Methods sections or designing new experiments–even as they loathe perfectly pleasant activities like, say, sitting down to analyze the data they’ve collected, or writing a few lines of code that could save them hours’ worth of manual data entry. On a personal level, I find this almost incomprehensible: how could anyone possibly enjoy collecting data more than actually crunching the numbers and learning new things? But I know these people exist, because I’ve talked to them. And I recognize that, from their perspective, I’m the guy with the strange views. They’re sitting there thinking: what kind of joker actually likes to turn his data inside out several dozen times? What’s wrong with just running a simple t-test and writing up the results as fast as possible, so you can get back to the pleasure of designing and running new experiments?

This of course leads us directly to the care bears fucking tea party moment where I tell you how wonderful it is that we all have these different likes and dislikes. I’m not being sarcastic; it really is great. Ultimately, it works to everyone’s advantage that we enjoy different things, because it means we get to collaborate on projects and take advantage of complementary strengths and interests, instead of all having to fight over who gets to write the same part of the Methods section. It’s good that there are some people who love benchwork and some people who hate it, and it’s good that there are people who’re happy to write software that other people who hate writing software can use. We don’t all have to pretend we understand each other; it’s enough just to nod and smile and say “but of course you can write the Methods for that paper; I really don’t mind. And yes, I guess I can run some additional analyses for you, really, it’s not too much trouble at all.”