Category Archives: academics

whether or not you should pursue a career in science still depends mostly on that thing that is you

I took the plunge a couple of days ago and answered my first question on Quora. Since Brad Voytek won’t shut up about how great Quora is, I figured I should give it a whirl. So far, Brad is not wrong.

The question in question is: “How much do you agree with Johnathan Katz’s advice on (not) choosing science as a career? Or how realistic is it today (the article was written in 1999)?” The Katz piece referred to is here. The gist of it should be familiar to many academics; the argument boils down to the observation that relatively few people who start graduate programs in science actually end up with permanent research positions, and even then, the need to obtain funding often crowds out the time one has to do actual science. Katz’s advice is basically: don’t pursue a career in science. It’s not an optimistic piece.

My answer is, I think, somewhat more optimistic. Here’s the full text:

The real question is what you think it means to be a scientist. Science differs from many other professions in that the typical process of training as a scientist–i.e., getting a Ph.D. in a scientific field from a major research university–doesn’t guarantee you a position among the ranks of the people who are training you. In fact, it doesn’t come close to guaranteeing it; the proportion of PhD graduates in science who go on to obtain tenure-track positions at research-intensive universities is very small–around 10% in most recent estimates. So there is a very real sense in which modern academic science is a bit of a pyramid scheme: there are a relatively small number of people at the top, and a lot of people on the rungs below laboring to get up to the top–most of whom will, by definition, fail to get there.

If you equate a career in science solely with a tenure-track position at a major research university, and are considering the prospect of a Ph.D. in science solely as an investment intended to secure that kind of position, then Katz’s conclusion is difficult to escape. He is, in most respects, correct: in most biomedical, social, and natural science fields, science is now an extremely competitive enterprise. Not everyone makes it through the PhD; of those who do, not everyone makes it into–and then through–one or more postdocs; and of those who do that, relatively few secure tenure-track positions. Then, of those few “lucky” ones, some will fail to get tenure, and many others will find themselves spending much or most of their time writing grants and managing people instead of actually doing science. So from that perspective, Katz is probably right: if what you mean when you say you want to become a scientist is that you want to run your own lab at a major research university, then your odds of achieving that at the outset are probably not very good (though, to be clear, they’re still undoubtedly better than your odds of becoming a successful artist, musician, or professional athlete). Unless you have really, really good reasons to think that you’re particularly brilliant, hard-working, and creative (note: undergraduate grades, casual feedback from family and friends, and your own internal gut sense do not qualify as really, really good reasons), you probably should not pursue a career in science.

But that’s only true given a rather narrow conception where your pursuit of a scientific career is motivated entirely by the end goal rather than by the process, and where failure is anything other than ending up with a permanent tenure-track position. By contrast, if what you’re really after is an environment in which you can pursue interesting questions in a rigorous way, surrounded by brilliant minds who share your interests, and with more freedom than you might find at a typical 9 to 5 job, the dream of being a scientist is certainly still alive, and is worth pursuing. The trivial demonstration of this is that if you’re one of the many people who actually enjoy the graduate school environment (yes, they do exist!), it may not even matter to you that much whether or not you have a good shot at getting a tenure-track position when you graduate.

To see this, imagine that you’ve just graduated with an undergraduate degree in science, and someone offers you a choice between two positions for the next six years. One position is (relatively) financially secure, but involves rather boring work of questionable utility to society, an inflexible schedule, and colleagues who are mostly only there for a paycheck. The other position has terrible pay, but offers fascinating and potentially important work, a flexible lifestyle, and colleagues who are there because they share your interests and want to do scientific research.

Admittedly, real-world choices are rarely this stark. Many non-academic jobs offer many of the same perceived benefits of academia (e.g., many tech jobs offer excellent working conditions, flexible schedules, and important work). Conversely, many academic environments don’t quite live up to the ideal of a place where you can go to pursue your intellectual passion unfettered by the annoyances of “real” jobs–there’s often just as much in the way of political intrigue, personality dysfunction, and menial due-paying duties. But to a first approximation, this is basically the choice you have when considering whether to go to graduate school in science or pursue some other career: you’re trading financial security and a fixed 40-hour work week against intellectual engagement and a flexible lifestyle. And the point to note is that, even if we completely ignore what happens after the six years of grad school are up, there is clearly a non-negligible segment of the population who would quite happily opt for the second choice–even recognizing full well that at the end of six years they may have to leave and move on to something else, with little to show for their effort. (Of course, in reality we don’t need to ignore what happens after six years, because many PhDs who don’t get tenure-track positions find rewarding careers in other fields–many of them scientific in nature. And, even though it may not be a great economic investment, having a Ph.D. in science is a great thing to be able to put on one’s resume when applying for a very broad range of non-academic positions.)

The bottom line is that whether or not you should pursue a career in science has as much or more to do with your goals and personality as it does with the current environment within or outside of (academic) science. In an ideal world (which is certainly what the 1970s as described by Katz sound like, though I wasn’t around then), it wouldn’t matter: if you had any inkling that you wanted to do science for a living, you would simply go to grad school in science, and everything would probably work itself out. But given real-world constraints, it’s absolutely essential that you think very carefully about what kind of environment makes you happy and what your expectations and goals for the future are. You have to ask yourself: Am I the kind of person who values intellectual freedom more than financial security? Do I really love the process of actually doing science–not some idealized movie version of it, but the actual messy process–enough to warrant investing a huge amount of my time and energy over the next few years? Can I deal with perpetual uncertainty about my future? And ultimately, would I be okay doing something that I really enjoy for six years if at the end of that time I have to walk away and do something very different?

If the answer to all of these questions is yes–and for many people it is!–then pursuing a career in science is still a very good thing to do (and hey, you can always quit early if you don’t like it–then you’ve lost very little time!). If the answer to any of them is no, then Katz may be right. A prospective career in science may or may not be for you, but at the very least, you should carefully consider alternative prospects. There’s absolutely no shame in going either route; the important thing is just to make an honest decision that takes the facts as they are and not as you wish that they were.

A couple of other thoughts I’ll add belatedly:

  • Calling academia a pyramid scheme is admittedly a bit hyperbolic. It’s true that the personnel structure in academia broadly has the shape of a pyramid, but that’s true of most organizations in most other domains too. Pyramid schemes are typically built on promises and lies that (almost by definition) can’t be realized, and I don’t think many people who enter a Ph.D. program in science can claim with a straight face that they were guaranteed a permanent research position at the end of the road (or that it’s impossible to get such a position). As I suggested in this post, it’s much more likely that everyone involved is simply guilty of minor (self-)deception: faculty don’t go out of their way to tell prospective students what the odds are of actually getting a tenure-track position, and prospective grad students don’t work very hard to find out the painful truth, or to tell faculty what their real intentions are after they graduate. And it may actually be better for everyone that way.
  • Just in case it’s not clear from the above, I’m not in any way condoning the historically low levels of science funding, or the fact that very few science PhDs go on to careers in academic research. I would love for NIH and NSF budgets (or whatever your local agency is) to grow substantially–and for everyone to get exactly the kind of job they want, academic or not. But that’s not the world we live in, so we may as well be pragmatic about it and try to identify the conditions under which it does or doesn’t make sense to pursue a career in science right now.
  • I briefly mention this above, but it’s probably worth stressing that there are many jobs outside of academia that still allow one to do scientific research, albeit typically with less freedom (but often for better hours and pay). In particular, the market for data scientists is booming right now, and many of the hires are coming directly from academia. One lesson to take away from this is: if you’re in a science Ph.D. program right now, you should really spend as much time as you can building up your quantitative and technical skills, because they could very well be the difference between a job that involves scientific research and one that doesn’t in the event you leave academia. And those skills will still serve you well in your research career even if you end up staying in academia.


the truth is not optional: five bad reasons (and one mediocre one) for defending the status quo

You could be forgiven for thinking that academic psychologists have all suddenly turned into professional whistleblowers. Everywhere you look, interesting new papers are cropping up purporting to describe this or that common-yet-shady methodological practice, and telling us what we can collectively do to solve the problem and improve the quality of the published literature. In just the last year or so, Uri Simonsohn introduced new techniques for detecting fraud, and used those tools to identify at least 3 cases of high-profile, unabashed data forgery. Simmons and colleagues reported simulations demonstrating that standard exploitation of research degrees of freedom in analysis can produce extremely high rates of false positive findings. Pashler and colleagues developed a “PsychFileDrawer” repository for tracking replication attempts. Several researchers raised trenchant questions about the veracity and/or magnitude of many high-profile psychological findings such as John Bargh’s famous social priming effects. Wicherts and colleagues showed that authors of psychology articles who are less willing to share their data upon request are more likely to make basic statistical errors in their papers. And so on and so forth. The flood shows no signs of abating; just last week, the APS journal Perspectives on Psychological Science announced that it’s introducing a new “Registered Replication Report” section that will commit to publishing pre-registered high-quality replication attempts, irrespective of their outcome.

Personally, I think these are all very welcome developments for psychological science. They’re solid indications that we psychologists are going to be able to police ourselves successfully in the face of some pretty serious problems, and they bode well for the long-term health of our discipline. My sense is that the majority of other researchers–perhaps the vast majority–share this sentiment. Still, as with any zeitgeist shift, there are always naysayers. In discussing these various developments and initiatives with other people, I’ve found myself arguing, with somewhat surprising frequency, with people who for various reasons think it’s not such a good thing that Uri Simonsohn is trying to catch fraudsters, or that social priming findings are being questioned, or that the consequences of flexible analyses are being exposed. Since many of the arguments I’ve come across tend to recur, I thought I’d summarize the most common ones here–along with the rebuttals I usually offer for why, with one possible exception, the arguments for giving a pass to sloppy-but-common methodological practices are not very compelling.

“But everyone does it, so how bad can it be?”

We typically assume that long-standing conventions must exist for some good reason, so when someone raises doubts about some widespread practice, it’s quite natural to question the person raising the doubts rather than the practice itself. Could it really, truly be (we say) that there’s something deeply strange and misguided about using p values? Is it really possible that the reporting practices converged on by thousands of researchers in tens of thousands of neuroimaging articles might leave something to be desired? Could failing to correct for the many researcher degrees of freedom associated with most datasets really inflate the false positive rate so dramatically?

The answer to all these questions, of course, is yes–or at least, we should allow that it could be yes. It is, in principle, entirely possible for an entire scientific field to regularly do things in a way that isn’t very good. There are domains where appeals to convention or consensus make perfect sense, because there are few good reasons to do things a certain way except inasmuch as other people do them the same way. If everyone else in your country drives on the right side of the road, you may want to consider driving on the right side of the road too. But science is not one of those domains. In science, there is no intrinsic benefit to doing things just for the sake of convention. In fact, almost by definition, major scientific advances are ones that tend to buck convention and suggest things that other researchers may not have considered possible or likely.

In the context of common methodological practice, it’s no defense at all to say but everyone does it this way, because there are usually relatively objective standards by which we can gauge the quality of our methods, and it’s readily apparent that there are many cases where the consensus approach leaves something to be desired. For instance, you can’t really justify failing to correct for multiple comparisons when you report a single test that’s just barely significant at p < .05 on the grounds that nobody else corrects for multiple comparisons in your field. That may be a valid explanation for why your paper successfully got published (i.e., reviewers didn’t want to hold your feet to the fire for something they themselves are guilty of in their own work), but it’s not a valid defense of the actual science. If you run a t-test on randomly generated data 20 times, you will, on average, get a significant result, p < .05, once. It does no one any good to argue that because the convention in a field is to allow multiple testing–or to ignore statistical power, or to report only p values and not effect sizes, or to omit mention of conditions that didn’t ‘work’, and so on–it’s okay to ignore the issue. There’s a perfectly reasonable question as to whether it’s a smart career move to start imposing methodological rigor on your work unilaterally (see below), but there’s no question that the mere presence of consensus or convention surrounding a methodological practice does not make that practice okay from a scientific standpoint.
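If you want to make that 1-in-20 arithmetic concrete for yourself, here is a minimal simulation sketch (mine, not anything from this post; the 30 subjects per group, the 5,000 simulated “papers”, and the random seed are all arbitrary assumptions). It shows that 20 uncorrected t-tests on pure noise produce about one false positive on average, at least one “significant” result roughly 64% of the time, and that a simple Bonferroni correction pulls the familywise error rate back down to roughly .05:

```python
# Minimal sketch: 20 t-tests per simulated "paper", all run on pure noise,
# so every significant result is a false positive by construction.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_sims, n_tests, alpha = 5_000, 20, 0.05

total_hits = 0
any_uncorrected = 0
any_bonferroni = 0

for _ in range(n_sims):
    # 20 two-sample t-tests; both groups come from the same distribution.
    p = np.array([
        stats.ttest_ind(rng.normal(size=30), rng.normal(size=30)).pvalue
        for _ in range(n_tests)
    ])
    total_hits += (p < alpha).sum()
    any_uncorrected += (p < alpha).any()
    any_bonferroni += (p < alpha / n_tests).any()  # Bonferroni: alpha / number of tests

print(f"mean false positives per 20 tests:    {total_hits / n_sims:.2f}")      # ~1.0
print(f"P(at least one uncorrected p < .05):  {any_uncorrected / n_sims:.2f}") # ~0.64
print(f"P(at least one hit after Bonferroni): {any_bonferroni / n_sims:.2f}")  # ~0.05
```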

“But psychology would break if we could only report results that were truly predicted a priori!”

This is a defense that has some plausibility at first blush. It’s certainly true that if you force researchers to correct for multiple comparisons properly, and report the many analyses they actually conducted–and not just those that “worked”–a lot of stuff that used to get through the filter will now get caught in the net. So, by definition, it would be harder to detect unexpected effects in one’s data–even when those unexpected effects are, in some sense, ‘real’. But the important thing to keep in mind is that raising the bar for what constitutes a believable finding doesn’t actually prevent researchers from discovering unexpected new effects; all it means is that it becomes harder to report post-hoc results as pre-hoc results. It’s not at all clear why forcing researchers to put in more effort validating their own unexpected finding is a bad thing.

In fact, forcing researchers to go the extra mile in this way would have one exceedingly important benefit for the field as a whole: it would shift the onus of determining whether an unexpected result is plausible enough to warrant pursuing away from the community as a whole, and towards the individual researcher who discovered the result in the first place. As it stands right now, if I discover an unexpected result (p < .05!) that I can make up a compelling story for, there’s a reasonable chance I might be able to get that single result into a short paper in, say, Psychological Science. And reap all the benefits that attend getting a paper into a “high-impact” journal. So in practice there’s very little penalty to publishing questionable results, even if I myself am not entirely (or even mostly) convinced that those results are reliable. This state of affairs is, to put it mildly, not A Good Thing.

In contrast, if you as an editor or reviewer start insisting that I run another study that directly tests and replicates my unexpected finding before you’re willing to publish my result, I now actually have something at stake. Because it takes time and money to run new studies, I’m probably not going to bother to follow up on my unexpected finding unless I really believe it. Which is exactly as it should be: I’m the guy who discovered the effect, and I know about all the corners I have or haven’t cut in order to produce it; so if anyone should make the decision about whether to spend more taxpayer money chasing the result, it should be me. You, as the reviewer, are not in a great position to know how plausible the effect truly is, because you have no idea how many different types of analyses I attempted before I got something to ‘work’, or how many failed studies I ran that I didn’t tell you about. Given the huge asymmetry in information, it seems perfectly reasonable for reviewers to say, You think you have a really cool and unexpected effect that you found a compelling story for? Great; go and directly replicate it yourself and then we’ll talk.

“But mistakes happen, and people could get falsely accused!”

Some people don’t like the idea of a guy like Simonsohn running around and busting people’s data fabrication operations for the simple reason that they worry that the kind of approach Simonsohn used to detect fraud is just not that well-tested, and that if we’re not careful, innocent people could get swept up in the net. I think this concern stems from fundamentally good intentions, but once again, I think it’s also misguided.

For one thing, it’s important to note that, despite all the press, Simonsohn hasn’t actually done anything qualitatively different from what other whistleblowers or skeptics have done in the past. He may have suggested new techniques that improve the efficiency with which cheating can be detected, but it’s not as though he invented the ability to report or investigate other researchers for suspected misconduct. Researchers suspicious of other researchers’ findings have always used qualitatively similar arguments to raise concerns. They’ve said things like, hey, look, this is a pattern of data that just couldn’t arise by chance, or, the numbers are too similar across different conditions.

More to the point, perhaps, no one is seriously suggesting that independent observers shouldn’t be allowed to raise their concerns about possible misconduct with journal editors, professional organizations, and universities. There really isn’t any viable alternative. Naysayers who worry that innocent people might end up ensnared by false accusations presumably aren’t suggesting that we do away with all of the existing mechanisms for ensuring accountability; but since the role of people like Simonsohn is only to raise suspicion and provide evidence (and not to do the actual investigating or firing), it’s clear that there’s no way to regulate this type of behavior even if we wanted to (which I would argue we don’t). If I wanted to spend the rest of my life scanning the statistical minutiae of psychology articles for evidence of misconduct and reporting it to the appropriate authorities (and I can assure you that I most certainly don’t), there would be nothing anyone could do to stop me, nor should there be. Remember that accusing someone of misconduct is something anyone can do, but establishing that misconduct has actually occurred is a serious task that requires careful internal investigation. No one–certainly not Simonsohn–is suggesting that a routine statistical test should be all it takes to end someone’s career. In fact, Simonsohn himself has noted that he identified a 4th case of likely fraud that he dutifully reported to the appropriate authorities only to be met with complete silence. Given all the incentives universities and journals have to look the other way when accusations of fraud are made, I suspect we should be much more concerned about the false negative rate than the false positive rate when it comes to fraud.

“But it hurts the public’s perception of our field!”

Sometimes people argue that even if the field does have some serious methodological problems, we still shouldn’t discuss them publicly, because doing so is likely to instill a somewhat negative view of psychological research in the public at large. The unspoken implication being that, if the public starts to lose confidence in psychology, fewer students will enroll in psychology courses, fewer faculty positions will be created to teach students, and grant funding to psychologists will decrease. So, by airing our dirty laundry in public, we’re only hurting ourselves. I had an email exchange with a well-known researcher to exactly this effect a few years back in the aftermath of the Vul et al “voodoo correlations” paper–a paper I commented on to the effect that the problem was even worse than suggested. The argument my correspondent raised was, in effect, that we (i.e., neuroimaging researchers) are all at the mercy of agencies like NIH to keep us employed, and if it starts to look like we’re clowning around, the unemployment rate for people with PhDs in cognitive neuroscience might start to rise precipitously.

While I obviously wouldn’t want anyone to lose their job or their funding solely because of a change in public perception, I can’t say I’m very sympathetic to this kind of argument. The problem is that it places short-term preservation of the status quo above both the long-term health of the field and the public’s interest. For one thing, I think you have to be quite optimistic to believe that some of the questionable methodological practices that are relatively widespread in psychology (data snooping, selective reporting, etc.) are going to sort themselves out naturally if we just look the other way and let nature run its course. The obvious reason for skepticism in this regard is that many of the same criticisms have been around for decades, and it’s not clear that anything much has improved. Maybe the best example of this is Sedlmeier and Gigerenzer’s 1989 paper entitled “Do studies of statistical power have an effect on the power of studies?”, in which the authors convincingly showed that despite three decades of work by luminaries like Jacob Cohen advocating power analyses, statistical power had not risen appreciably in psychology studies. The presence of such unwelcome demonstrations suggests that sweeping our problems under the rug in the hopes that someone (the mice?) will unobtrusively take care of them for us is wishful thinking.
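(For readers who have never actually run one, a prospective power analysis really does take only a few lines of code. The sketch below is mine, not anything from the Sedlmeier and Gigerenzer paper; the “medium” effect size of d = 0.5 and the 20 subjects per group are purely illustrative assumptions.)

```python
# A hedged sketch of a prospective power analysis (illustrative numbers only;
# d = 0.5 and n = 20 per group are assumptions, not figures from the paper).
from statsmodels.stats.power import TTestIndPower

ttest_power = TTestIndPower()

# Power of a two-sample t-test with 20 subjects per group to detect a
# "medium" standardized effect (Cohen's d = 0.5) at alpha = .05.
achieved = ttest_power.power(effect_size=0.5, nobs1=20, alpha=0.05)
print(f"power with n = 20 per group: {achieved:.2f}")   # roughly 0.33

# Sample size per group needed to reach the conventional 80% power target.
n_needed = ttest_power.solve_power(effect_size=0.5, alpha=0.05, power=0.8)
print(f"n per group for 80% power:   {n_needed:.0f}")    # roughly 64
```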

In any case, even if problems did tend to solve themselves when hidden away from the prying eyes of the media and public, the bigger problem with what we might call the “saving face” defense is that it is, fundamentally, an abuse of taxpayers’ trust. As with so many other things, Richard Feynman summed up the issue eloquently in his famous Cargo Cult Science commencement speech:

For example, I was a little surprised when I was talking to a friend who was going to go on the radio. He does work on cosmology and astronomy, and he wondered how he would explain what the applications of this work were. “Well,” I said, “there aren’t any.” He said, “Yes, but then we won’t get support for more research of this kind.” I think that’s kind of dishonest. If you’re representing yourself as a scientist, then you should explain to the layman what you’re doing–and if they don’t want to support you under those circumstances, then that’s their decision.

The fact of the matter is that our livelihoods as researchers depend directly on the goodwill of the public. And the taxpayers are not funding our research so that we can “discover” interesting-sounding but ultimately unreplicable effects. They’re funding our research so that we can learn more about the human mind and hopefully be able to fix it when it breaks. If a large part of the profession is routinely employing practices that are at odds with those goals, it’s not clear why taxpayers should be footing the bill. From this perspective, it might actually be a good thing for the field to revise its standards, even if (in the worst-case scenario) that causes a short-term contraction in employment.

“But unreliable effects will just fail to replicate, so what’s the big deal?”

This is a surprisingly common defense of sloppy methodology, maybe the single most common one. It’s also an enormous cop-out, since it pre-empts the need to think seriously about what you’re doing in the short term. The idea is that, since no single study is definitive, and a consensus about the reality or magnitude of most effects usually doesn’t develop until many studies have been conducted, it’s reasonable to impose a fairly low bar on initial reports and then wait and see what happens in subsequent replication efforts.

I think this is a nice ideal, but things just don’t seem to work out that way in practice. For one thing, there doesn’t seem to be much of a penalty for publishing high-profile results that later fail to replicate. The reason, I suspect, is that we’re inclined to give researchers the benefit of the doubt: surely (we say to ourselves), Jane Doe did her best, and we like Jane, so why should we question the work she produces? If we’re really so skeptical about her findings, shouldn’t we go replicate them ourselves, or wait for someone else to do it?

While this seems like an agreeable and fair-minded attitude, it isn’t actually a terribly good way to look at things. Granted, if you really did put in your best effort–dotted all your i’s and crossed all your t’s–and still ended up reporting a false result, we shouldn’t punish you for it. I don’t think anyone is seriously suggesting that researchers who inadvertently publish false findings should be ostracized or shunned. On the other hand, it’s not clear why we should continue to celebrate scientists who ‘discover’ interesting effects that later turn out not to replicate. If someone builds a career on the discovery of one or more seemingly important findings, and those findings later turn out to be wrong, the appropriate attitude is to update our beliefs about the merit of that person’s work. As it stands, we rarely seem to do this.

In any case, the bigger problem with appeals to replication is that the delay between initial publication of an exciting finding and subsequent consensus disconfirmation can be very long, and often spans entire careers. Waiting decades for history to prove an influential idea wrong is a very bad idea if the available alternative is to nip the idea in the bud by requiring stronger evidence up front.

There are many notable examples of this in the literature. A well-publicized recent one is John Bargh’s work on the motor effects of priming people with elderly stereotypes–namely, that priming people with words related to old age makes them walk away from the experiment more slowly. Bargh’s original paper was published in 1996, and according to Google Scholar, has now been cited over 2,000 times. It has undoubtedly been hugely influential in directing many psychologists’ research programs in certain directions (in many cases, in directions that are equally counterintuitive and also now seem open to question). And yet it’s taken over 15 years for a consensus to develop that the original effect is at the very least much smaller in magnitude than originally reported, and potentially so small as to be, for all intents and purposes, “not real”. I don’t know who reviewed Bargh’s paper back in 1996, but I suspect that if they ever considered the seemingly implausible size of the effect being reported, they might well have thought to themselves, well, I’m not sure I believe it, but that’s okay–time will tell. Time did tell, of course; but time is kind of lazy, so it took fifteen years for it to tell. In an alternate universe, a reviewer might have said, well, this is a striking finding, but the effect seems implausibly large; I would like you to try to directly replicate it in your lab with a much larger sample first. I recognize that this is onerous and annoying, but my primary responsibility is to ensure that only reliable findings get into the literature, and inconveniencing you seems like a small price to pay. Plus, if the effect is really what you say it is, people will be all the more likely to believe you later on.

Or take the actor-observer asymmetry, which appears in just about every introductory psychology textbook written in the last 20 – 30 years. It states that people are relatively more likely to attribute their own behavior to situational factors, and relatively more likely to attribute other agents’ behaviors to those agents’ dispositions. When I slip and fall, it’s because the floor was wet; when you slip and fall, it’s because you’re dumb and clumsy. This putative asymmetry was introduced and discussed at length in a book by Jones and Nisbett in 1971, and hundreds of studies have investigated it at this point. And yet a 2006 meta-analysis by Malle suggested that the cumulative evidence for the actor-observer asymmetry is actually very weak. There are some specific circumstances under which you might see something like the postulated effect, but what is quite clear is that it’s nowhere near strong enough an effect to justify being routinely invoked by psychologists and even laypeople to explain individual episodes of behavior. Unfortunately, at this point it’s almost impossible to dislodge the actor-observer asymmetry from the psyche of most researchers–a reality underscored by the fact that the Jones and Nisbett book has been cited nearly 3,000 times, whereas the 2006 meta-analysis has been cited only 96 times (a very low rate for an important and well-executed meta-analysis published in Psychological Bulletin).

The fact that it can take many years–whether 15 or 45–for a literature to build up to the point where we’re even in a position to suggest with any confidence that an initially exciting finding could be wrong means that we should be very hesitant to appeal to long-term replication as an arbiter of truth. Replication may be the gold standard in the very long term, but in the short and medium term, appealing to replication is a huge cop-out. If you can see problems with an analysis right now that cast aspersions on a study’s results, it’s an abdication of responsibility to downplay your concerns and wait for someone else to come along and spend a lot more time and money trying to replicate the study. You should point out now why you have concerns. If the authors can address them, the results will look all the better for it. And if the authors can’t address your concerns, well, then, you’ve just done science a service. If it helps, don’t think of it as a matter of saying mean things about someone else’s work, or of asserting your own ego; think of it as potentially preventing a lot of very smart people from wasting a lot of time chasing down garden paths–and also saving a lot of taxpayer money. Remember that our job as scientists is not to make other scientists’ lives easy in the hopes they’ll repay the favor when we submit our own papers; it’s to establish and apply standards that produce convergence on the truth in the shortest amount of time possible.

“But it would hurt my career to be meticulously honest about everything I do!”

Unlike the other considerations listed above, I think the concern that being honest carries a price when it comes to doing research has a good deal of merit to it. Given the aforementioned delay between initial publication and later disconfirmation of findings (which even in the best case is usually longer than the delay between obtaining a tenure-track position and coming up for tenure), researchers have many incentives to emphasize expediency and good story-telling over accuracy, and it would be disingenuous to suggest otherwise. No malevolence or outright fraud is implied here, mind you; the point is just that if you keep second-guessing and double-checking your analyses, or insist on routinely collecting more data than other researchers might think is necessary, you will very often find that results that could have made a bit of a splash given less rigor are actually not particularly interesting upon careful cross-examination. Which means that researchers who have, shall we say, less of a natural inclination to second-guess, double-check, and cross-examine their own work will, to some degree, be more likely to publish results that make a bit of a splash (it would be nice to believe that pre-publication peer review filters out sloppy work, but empirically, it just ain’t so). So this is a classic tragedy of the commons: what’s good for a given individual, career-wise, is clearly bad for the community as a whole.

I wish I had a good solution to this problem, but I don’t think there are any quick fixes. The long-term solution, as many people have observed, is to restructure the incentives governing scientific research in such a way that individual and communal benefits are directly aligned. Unfortunately, that’s easier said than done. I’ve written a lot both in papers (1, 2, 3) and on this blog (see posts linked here) about various ways we might achieve this kind of realignment, but what’s clear is that it will be a long and difficult process. For the foreseeable future, it will continue to be an understandable though highly lamentable defense to say that the cost of maintaining a career in science is that one sometimes has to play the game the same way everyone else plays the game, even if it’s clear that the rules everyone plays by are detrimental to the communal good.


Anyway, this may all sound a bit depressing, but I really don’t think it should be taken as such. Personally I’m actually very optimistic about the prospects for large-scale changes in the way we produce and evaluate science within the next few years. I do think we’re going to collectively figure out how to do science in a way that directly rewards people for employing research practices that are maximally beneficial to the scientific community as a whole. But I also think that for this kind of change to take place, we first need to accept that many of the defenses we routinely give for using iffy methodological practices are just not all that compelling.

on writing: some anecdotal observations, in no particular order

  • Early on in graduate school, I invested in the book “How to Write a Lot”. I enjoyed reading it–mostly because I (mistakenly) enjoyed thinking to myself, “hey, I bet as soon as I finish this book, I’m going to start being super productive!” But I can save you the $9 and tell you there’s really only one take-home point: schedule writing like any other activity, and stick to your schedule no matter what. Though, having said that, I don’t really do that myself. I find I tend to write about 20 hours a week on average. On a very good day, I manage to get a couple of thousand words written, but much more often, I get 200 words written that I then proceed to rewrite furiously and finally trash in frustration. But it all adds up in the long run I guess.
  • Some people are good at writing one thing at a time; they can sit down for a week and crank out a solid draft of a paper without ever looking sideways at another project. Personally, unless I have a looming deadline (and I mean a real deadline–more on that below), I find that impossible to do; my general tendency is to work on one writing project for an hour or two, and then switch to something else. Otherwise I pretty much lose my mind. I also find it helps to reward myself–i.e., I’ll work on something I really don’t want to do for an hour, and then play video games for a while (er, switch to writing something more pleasant).
  • I can rarely get any ‘real’ writing (i.e., stuff that leads to publications) done after around 6 pm; late mornings (i.e., right after I wake up) are usually my most productive writing time. And I generally only write for fun (blogging, writing fiction, etc.) after 9 pm. There are exceptions, but by and large that’s my system.
  • I don’t write many drafts. I don’t mean that I never revise papers, because I do–obsessively. But I don’t sit down thinking “I’m going to write a very rough draft, and then I’ll go back and clean up the language.” I sit down thinking “I’m going to write a perfect paper the first time around,” and then I very slowly crank out a draft that’s remarkably far from being perfect. I suspect the former approach is actually the more efficient one, but I can’t bring myself to do it. I hate seeing malformed sentences on the page, even if I know I’m only going to delete them later. It always amazes and impresses me when I get Word documents from collaborators with titles like “AmazingNatureSubmissionVersion18”. I just give all my documents the title “paper_draft”. There might be a V2 or a V3, but there will never, ever be a V18.
  • Papers are not meant to be written linearly. I don’t know anyone who starts with the Introduction, then does the Methods and Results, and then finishes with the Discussion. Personally I don’t even write papers one section at a time. I usually start out by frantically writing down ideas as they pop into my head, and jumping around the document as I think of other things I want to say. I frequently write half a sentence down and then finish it with a bunch of question marks (like so: ???) to indicate I need to come back later and patch it up. Incidentally, this is also why I’m terrified to ever show anyone any of my unfinished paper drafts: an unsuspecting reader would surely come away thinking I suffer from a serious thought disorder. (I suppose they might be right.)
  • Okay, that last point is not entirely true. I don’t write papers completely haphazardly; I do tend to write Methods and Results before Intro and Discussion. I gather that this is a pretty common approach. On the rare occasions when I’ve started writing the Introduction first, I’ve invariably ended up having to completely rewrite it, because it usually turns out the results aren’t actually what I thought they were.
  • My sense is that most academics get more comfortable writing as time goes on. Relatively few grad students have the perseverance to rapidly crank out publication-worthy papers from day 1 (I was definitely not one of them). I don’t think this is just a matter of practice; I suspect part of it is a natural maturation process. People generally get more conscientious as they age; it stands to reason that writing (as an activity most people find unpleasant) should get easier too. I’m better at motivating myself to write papers now, but I’m also much better about doing the dishes and laundry–and I’m pretty sure that’s not because practice makes dishwashing perfect.
  • When I started grad school, I was pretty sure I’d never publish anything, let alone graduate, because I’d never handed in a paper as an undergraduate that wasn’t written at the last minute, whereas in academia, there are virtually no hard deadlines (see below). I’m not sure exactly what changed. I’m still continually surprised every time something I wrote gets published. And I often catch myself telling myself, “hey, self, how the hell did you ever manage to pay attention long enough to write 5,000 words?” And then I reply to myself, “well, self, since you ask, I took a lot of stimulants.”
  • I pace around a lot when I write. A lot. To the point where my labmates–who are all uncommonly nice people–start shooting death glares my way. It’s a heritable tendency, I guess (the pacing, not the death glare attraction); my father also used to pace obsessively. I’m not sure what the biological explanation for it is. My best guess is it’s an arousal-mediated effect: I can think pretty well when I’m around other people, or when I’m in motion, but if I’m sitting at a desk and I don’t already know exactly what I want to say, I can’t get anything done. I generally pace around the lab or house for a while figuring out what I want to say, and then I sit down and write until I’ve forgotten what I want to say, or decide I didn’t really want to say that after all. In practice this usually works out to 10 minutes of pacing for every 5 minutes of writing. I envy people who can just sit down and calmly write for two or three hours without interruption (though I don’t think there are that many of them). At the same time, I’m pretty sure I burn a lot of calories this way.
  • I’ve been pleasantly surprised to discover that I much prefer writing grant proposals to writing papers–to the point where I actually enjoy writing grant proposals. I suspect the main reason for this is that grant proposals have a kind of openness that papers don’t; with a paper, you’re constrained to telling the story the data actually support, whereas a grant proposal is as good as your vision of what’s possible (okay, and plausible). A second part of it is probably the novelty of discovery: once you conduct your analyses, all that’s left is to tell other people what you found, which (to me) isn’t so exciting. I mean, I already think I know what’s going on; what do I care if you know? Whereas when writing a grant, a big part of the appeal for me is that I could actually go out and discover new stuff–just as long as I can convince someone to give me some money first.
  • At a departmental seminar attended by about 30 people, I once heard a student express concern about an in-progress review article that he and several of the other people at the seminar were collaboratively working on. The concern was that if all of the collaborators couldn’t agree on what was going to go in the paper (and they didn’t seem to be able to at that point), the paper wouldn’t get written in time to make the rapidly approaching deadline dictated by the journal editor. A senior and very brilliant professor responded to the student’s concern by pointing out that this couldn’t possibly be a real problem seeing as in reality there is actually no such thing as a hard writing deadline. This observation didn’t go over so well with some of the other senior professors, who weren’t thrilled that their students were being handed the key to the kingdom of academic procrastination so early in their careers. But it was true, of course: with the major exception of grant proposals (EDIT: and as Garrett points out in the comments below, conference publications in disciplines like Computer Science), most of the things academics write (journal articles, reviews, commentaries, book chapters, etc.) operate on a very flexible schedule. Usually when someone asks you to write something for them, there is some vague mention somewhere of some theoretical deadline, which is typically a date that seems so amazingly far off into the future that you wonder if you’ll even be the same person when it rolls around. And then, much to your surprise, the deadline rolls around and you realize that you must in fact really be a different person, because you don’t seem to have any real desire to work on this thing you signed up for, and instead of writing it, why don’t you just ask the editor for an extension while you go rustle up some motivation. So you send a polite email, and the editor grudgingly says, “well, hmm, okay, you can have another two weeks,” to which you smile and nod sagely, and then, two weeks later, you send another similarly worded but even more obsequious email that starts with the words “so, about that extension…”

    The basic point here is that there’s an interesting dilemma: even though there rarely are any strict writing deadlines, it’s to almost everyone’s benefit to pretend they exist. If I ever find out that the true deadline (insofar as such a thing exists) for the chapter I’m working on right now is 6 months from now and not 3 months ago (which is what they told me), I’ll probably relax and stop working on it for, say, the next 5 and a half months. I sometimes think that the most productive academics are the ones who are just really really good at repeatedly lying to themselves.

  • I’m a big believer in structured procrastination when it comes to writing. I try to always have a really unpleasant but not-so-important task in the background, which then forces me to work on only-slightly-unpleasant-but-often-more-important tasks. Except it often turns out that the unpleasant-but-not-so-important task is actually an unpleasant-but-really-important task after all, and then I wake up in a cold sweat in the middle of the night thinking of all the ways I’ve screwed myself over. No, just kidding. I just bitch about it to my wife for a while and then drown my sorrows in an extra helping of ice cream.
  • I’m really, really, bad at restarting projects I’ve put on the back burner for a while. Right now there are 3 or 4 papers I’ve been working on on-and-off for 3 or 4 years, and every time I pick them up, I write a couple of hundred words and then put them away for a couple of months. I guess what I’m saying is that if you ever have the misfortune of collaborating on a paper with me, you should make sure to nag me several times a week until I get so fed up with you I sit down and write the damn paper. Otherwise it may never see the light of day.
  • I like writing fiction in my spare time. I also occasionally write whiny songs. I’m pretty terrible at both of these things, but I enjoy them, and I’m told (though I don’t believe it for a second) that that’s the important thing.

in praise of self-policing

It’s IRB week over at The Hardest Science; Sanjay has an excellent series of posts (1, 2, 3) discussing some proposed federal rule changes to the way IRBs oversee research. The short of it is that the proposed changes are mostly good news for people who do minimal risk-type research with human subjects (i.e., stuff that doesn’t involve poking people with needles); if the changes pass as written, most of us will no longer have to file any documents with our IRBs before running our studies. We’ll just put in a short note saying we’ve determined that our studies are excused from review, and then we can start collecting data right away. It’ll work something like this*:

This doesn’t mean federal oversight of human subjects research will cease, of course. There will still be guidelines we all have to follow. But instead of making researchers jump through flaming hoops preemptively, enforcement will take place on an ad-hoc basis and via random audits. For the most part, the important decisions will be left to investigators rather than IRBs. For more details, see Sanjay’s excellent breakdown.

I also agree with Sanjay’s sentiment in his latest post that this is the right way to do things; researchers should police themselves, rather than employing an entire staff of people whose job it is to tell researchers how to safely and ethically do their research. In principle, the idea of having trained IRB analysts go over every study sounds nice; the problem is that it takes a very long time, generates a lot of extra work for everyone, and perhaps most problematically, sets up all sorts of perverse incentives. Namely, IRB analysts have an incentive to be pedantic (since they rarely lose their jobs if they ask for too much detail, but could be liable if they give too much leeway and something bad happens), and investigators have an incentive to off-load their conscience onto the IRB rather than actually having to think about the impact of their experiment on subjects. I catch myself doing this more often than I’d like, and I’m not really happy about it. (For instance, I recently found myself telling someone it was okay for them to present gruesome pictures to subjects “because the IRB doesn’t mind that”, and not because I thought the psychological impact was negligible. I gave myself twenty lashes for that one**.) I suspect that, aside from saving everyone a good deal of time and effort, placing the responsibility for doing research ethically on researchers’ shoulders would actually lead them to give more, and not less, consideration to ethical issues.

Anyway, it remains to be seen whether the proposed rules actually pass in their current form. One of the interesting features of the situation is that IRBs may now, perversely, have an incentive to fight against these rules going into effect, since they’d almost certainly need to lay off staff if we move to a system where most studies are entirely excused from review. I don’t really think that this will be much of an issue, and on balance I’m sure university administrations recognize how much IRBs slow down research; but it still can’t hurt for those of us who do research with human subjects to stick our heads past the Department of Health and Human Services’ doors and affirm that excusing most non-invasive human subjects research from review is the right thing to do.


* I know, I know. I managed to go two whole years on this blog without a single lolcat appearance, and now I throw it all away for this. Sorry.

** With a feather duster.

CNS 2011: a first-person shorthand account in the manner of Rocky Steps

Friday, April 1

4 pm. Arrive at SFO International on bumpy flight from Denver.

4:45 pm. Approach well-dressed man downtown and open mouth to ask for directions to Hyatt Regency San Francisco. “Sorry,” says well-dressed man, “No change to give.” Back off slowly, swinging bags, beard, and poster tube wildly, mumbling “I’m not a panhandler, I’m a neuroscientist.” Realize that difference between the two may be smaller than initially suspected.

6:30 pm. Hear loud knocking on hotel room door. Open door to find roommate. Say hello to roommate. Realize roommate is extremely drunk from East Coast flight. Offer roommate bag of coffee and orange tic-tacs. Roommate is confused, asks, “are you drunk?” Ignore roommate’s question. “You’re drunk, aren’t you.” Deny roommate’s unsubstantiated accusations. “When you write about this on your blog, you better not try to make it look like I’m the drunk one,” roommate says. Resolve to ignore roommate’s crazy talk for next 4 days.

6:45 pm. Attempt to open window of 10th floor hotel room in order to procure fresh air for face. Window refuses to open. Commence nudging of, screaming at, and bargaining with window. Window still refuses to open. Roommate points out sticker saying window does not open. Ignore sticker, continue berating window. Window still refuses to open, but now has low self-esteem.

8 pm. Have romantic candlelight dinner at expensive french restaurant with roommate. Make jokes all evening about ideal location (San Francisco) for start of new intimate relationship. Suspect roommate is uncomfortable, but persist in faux wooing. Roommate finally turns tables by offering to put out. Experience heightened level of discomfort, but still finish all of steak tartare and order creme brulee. Dessert appetite is immune to off-color humor!

11 pm – 1 am. Grand tour of seedy SF bars with roommate and old grad school friend. New nightlife low: denied entrance to seedy dance club because shoes insufficiently classy. Stupid Teva sandals.

Saturday, April 2

9:30 am. Wake up late. Contemplate running downstairs to check out ongoing special symposium for famous person who does important research. Decide against. Contemplate visiting hotel gym to work off creme brulee from last night. Decide against. Contemplate reading conference program in bed and circling interesting posters to attend. Decide against. Contemplate going back to sleep. Consult with self, make unanimous decision in favor.

1 pm. Have extended lunch meeting with collaborators at Ferry Building to discuss incipient top-secret research project involving diesel generator, overstock beanie babies, and apple core. Already giving away too much!

3:30 pm. Return to hotel. Discover hotel is now swarming with name badges attached to vaguely familiar faces. Hug vaguely familiar faces. Hugs are met with startled cries. Realize that vaguely familiar faces are actually completely unfamiliar faces. Wrong conference: Young Republicans, not Cognitive Neuroscientists. Make beeline for elevator bank, pursued by angry middle-aged men dressed in American flags.

5 pm. Poster session A! The sights! The sounds! The lone free drink at the reception! The wonders of yellow 8-point text on black 6′ x 4′ background! Too hard to pick a favorite thing, not even going to try. Okay, fine: free schwag at the exhibitor stands.

5 pm – 7 pm. Chat with old friends. Have good time catching up. Only non-fictionalized bullet point of entire piece.

8 pm. Dinner at belly dancing restaurant in lower Haight. Great conversation, good food, mediocre dancing. Towards end of night, insist on demonstrating own prowess in fine art of torso shaking; climb on table and gyrate body wildly, alternately singing Oompa-Loompa song and yelling “get in my belly!” at other restaurant patrons. Nobody tips.

12:30 am. Take the last train to Clarksville. Take last N train back to Hyatt Regency hotel.

Sunday, April 3

7 am. Wake up with amazing lack of hangover. Celebrate amazing lack of hangover by running repeated victory laps around 10th floor of Hyatt Regency, Rocky Steps style. Quickly realize initial estimate of hangover absence off by order of magnitude. Revise estimate; collapse in puddle on hotel room floor. Refuse to move until first morning session.

8:15 am. Wander the eight Caltech aisles of morning poster session in search of breakfast. Fascinating stuff, but this early in morning, only value signals of interest are smell and sight of coffee, muffins, and bagels.

10 am. Terrific symposium includes excellent talks about emotion, brain-body communication, and motivation, but favorite moment is still when friend arrives carrying bucket of aspirin.

1 pm. Bump into old grad school friend outside; decide to grab lunch on pier behind Ferry Building. Discuss anterograde amnesia and dating habits of mutual friends. Chicken and tofu cake is delicious. Sun is out, temperature is mild; perfect day to not attend poster sessions.

1:15 – 2 pm. Attend poster session.

2 pm – 5 pm. Presenting poster in 3 hours! Have full-blown panic attack in hotel room. Not about poster, about General Hospital. Why won’t Lulu take Dante’s advice and call support group number for alcoholics’ families?!?! Alcohol is Luke’s problem, Lulu! Call that number!

5 pm. Present world’s most amazing poster to three people. Launch into well-rehearsed speech about importance of work and great glory of sophisticated technical methodology before realizing two out of three people are mistakenly there for coffee and cake, and third person mistook presenter for someone famous. Pause to allow audience to mumble excuses and run to coffee bar. When coast is clear, resume glaring at anyone who dares to traverse poster aisle. Believe strongly in marking one’s territory.

8 pm. Lab dinner at House of Nanking. Food is excellent, despite unreasonably low tablespace-to-floorspace ratio. Conversation revolves around fainting goats, ‘relaxation’ in Thailand, and, occasionally, science.

10 pm. Karaoke at The Mint. Compare performance of CNS attendees with control group of regulars; establish presence of robust negative correlation between years of education and singing ability. Completely wreck voice performing whitest rendition ever of Shaggy’s “Oh Carolina”. Crowd jeers. No, wait, crowd gyrates. In wholesome scientific manner. Crowd is composed entirely of people with low self-monitoring skills; what luck! DJ grimaces through entire song and most of previous and subsequent songs.

2 am. Take cab back to hotel with graduate students and Memory Professor. Memory Professor is drunk; manages to nearly fall out of cab while cab in motion. In-cab conversation revolves around merits of dynamic programming languages. No consensus reached, but civility maintained. Arrival at hotel: all cab inhabitants below professorial rank immediately slip out of cab and head for elevators, leaving Memory Professor to settle bill. In elevator, Graduate Student A suggests that attempt to push Memory Professor out of moving cab was bad idea in view of Graduate Student A’s impending post-doc with Memory Professor. Acknowledge probable wisdom of Graduate Student A’s observation while simultaneously resolving to not adjust own degenerate behavior in the slightest.

2:15 am. Drink at least 24 ounces of water before attaining horizontal position. Fall asleep humming bars of Elliott Smith’s Angeles. Wrong city, but close enough.

Monday, April 4

8 am. Wake up hangover free again! For real this time. No Rocky Steps dance. Shower and brush teeth. Delicately stroke roommate’s cheek (he’ll never know) before heading downstairs for poster session.

8:30 am. Bagels, muffin, coffee. Not necessarily in that order.

9 am – 12 pm. Skip sessions, spend morning in hotel room working. While trying to write next section of grant proposal, experience strange sensation of time looping back on itself, like a snake eating its own tail, but also eating grant proposal at same time. Awake from unexpected nap with ‘Innovation’ section in mouth.

12:30 pm. Skip lunch; for some reason, not very hungry.

1 pm. Visit poster with screaming purple title saying “COME HERE FOR FREE CHOCOLATE.” Am impressed with poster title and poster, but disappointed by free chocolate selection: Dove eggs and purple Hershey’s kisses–worst chocolate in the world! Resolve to show annoyance by disrupting presenter’s attempts to maintain conversation with audience. Quickly knocked out by chocolate eggs thrown by presenter.

5 pm. Wake up in hotel room with headache and no recollection of day’s events. Virus or hangover? Unclear. For some reason, hair smells like chocolate.

7:30 pm. Dinner at Ferry Building with Brain Camp friends. Have now visited Ferry Building at least one hundred times in seventy-two hours. Am now compulsively visiting Ferry Building every fifteen minutes just to feel normal.

9:30 pm. Party at Americano Restaurant & Bar for Young Investigator Award winner. Award comes with $500 and strict instructions to be spent on drinks for total strangers. Strange tradition, but no one complains.

11 pm. Bar is crowded with neuroscientists having great time at Young Investigator’s expense.

11:15 pm. Drink budget runs out.

11:17 pm. Neuroscientists mysteriously vanish.

1 am. Stroll through San Francisco streets in search of drink. Three false alarms, but finally arrive at open pub 10 minutes before last call. Have extended debate with friend over whether hotel room can be called ‘home’. Am decidedly in No camp; ‘home’ is for long-standing attachments, not 4-day hotel hobo runs.

2 am. Walk home.

Tuesday, April 5

9:05 am. Show up 5 minutes late for bagels and muffins. All gone! Experience Apocalypse Now moment on inside, but manage not to show it–except for lone tear. Drown sorrows in Tazo Wild Sweet Orange tea. Tea completely fails to live up to name; experience second, smaller, Apocalypse Now moment. Roommate walks over and asks if everything okay, then gently strokes cheek and brushes away lone tear (he knew!!!).

9:10 – 1 pm. Intermittently visit poster and symposium halls. Not sure why. Must be force of habit learning system.

1:30 pm. Lunch with friends at Thai restaurant near Golden Gate Park. Fill belly up with coconut, noodles, and crab. About to get on table to express gratitude with belly dance, but notice that friends have suddenly disappeared.

2 – 5 pm. Roam around Golden Gate Park and Haight-Ashbury. Stop at Whole Foods for friend to use bathroom. Get chased out of Whole Foods for using bathroom without permission. Very exciting; first time feeling alive on entire trip! Continue down Haight. Discuss socks, ice cream addiction (no such thing), and funding situation in Europe. Turns out it sucks there too.

5:15 pm. Take BART to airport with lab members. Watch San Francisco recede behind train. Sink into slightly melancholic state, but recognize change of scenery is for the best: constitution couldn’t handle more Rocky Steps mornings.

7:55 pm. Suddenly rediscover pronouns as airplane peels away from gate.

8 pm PST – 11:20 MST. The flight’s almost completely empty; I get to stretch out across the entire emergency exit aisle. The sun goes down as we cross the Sierra Nevada; the last of the ice in my cup melts into water somewhere between Provo and Grand Junction. As we start our descent into Denver, the lights come out in force, and I find myself preemptively bored at the thought of the long shuttle ride home. For a moment, I wish I was back in my room at the Hyatt at 8 am–about to run Rocky Steps around the hotel, or head down to the poster hall to find someone to chat with over a bagel and coffee. For some reason, I still feel like I didn’t get quite enough time to hang out with all the people I wanted to see, despite barely sleeping in 4 days. But then sanity returns, and the thought quickly passes.

what Paul Meehl might say about graduate school admissions

Sanjay Srivastava has an excellent post up today discussing the common belief among many academics (or at least psychologists) that graduate school admission interviews aren’t very predictive of actual success, and should be assigned little or no weight when making admissions decisions:

The argument usually goes something like this: “All the evidence from personnel selection studies says that interviews don’t predict anything. We are wasting people’s time and money by interviewing grad students, and we are possibly making our decisions worse by substituting bad information for good.”

I have been hearing more or less that same thing for years, starting when I was in grad school myself. In fact, I have heard it often enough that, not being familiar with the literature myself, I accepted what people were saying at face value. But I finally got curious about what the literature actually says, so I looked it up.

I confess that I must have been drinking from the kool-aid spigot, because until I read Sanjay’s post, I’d long believed something very much like this myself, and for much the same reason. I’d never bothered to actually, you know, look at the data myself. Turns out the evidence and the kool-aid are not compatible:

A little Google Scholaring for terms like “employment interviews” and “incremental validity” led me to a bunch of meta-analyses that concluded that in fact interviews can and do provide useful information above and beyond other valid sources of information (like cognitive ability tests, work sample tests, conscientiousness, etc.). One of the most heavily cited is a 1998 Psych Bulletin paper by Schmidt and Hunter (link is a pdf; it’s also discussed in this blog post). Another was this paper by Cortina et al, which makes finer distinctions among different kinds of interviews. The meta-analyses generally seem to agree that (a) interviews correlate with job performance assessments and other criterion measures, (b) interviews aren’t as strong predictors as cognitive ability, (c) but they do provide incremental (non-overlapping) information, and (d) in those meta-analyses that make distinctions between different kinds of interviews, structured interviews are better than unstructured interviews.

This seems entirely reasonable, and I agree with Sanjay that it clearly shows that admissions interviews aren't useless, at least in an actuarial sense. That said, after thinking about it for a while, I'm not sure these findings really address the central question admissions committees care about. When deciding which candidates to admit as students, the relevant question isn't really “what factors predict success in graduate school?”; it's “what factors should the admissions committee attend to when making a decision?” These may seem like the same thing, but they're not. And the reason they're not is that knowing which factors are predictive of success is no guarantee that faculty are actually going to be able to use that information in an appropriate way. Knowing what predicts performance is only half the story, as it were; you also need to know exactly how to weight different factors appropriately in order to generate an optimal prediction.

In practice, humans turn out to be incredibly bad at predicting outcomes based on multiple factors. An enormous literature on mechanical (or actuarial) prediction, which Sanjay mentions in his post, has repeatedly demonstrated that in many domains, human judgments are consistently and often substantially outperformed by simple regression equations. There are several reasons for this gap, but one of the biggest ones is that people are just shitty at quantitatively integrating multiple continuous variables. When you visit a car dealership, you may very well be aware that your long-term satisfaction with any purchase is likely to depend on some combination of horsepower, handling, gas mileage, seating comfort, number of cupholders, and so on. But the odds that you’ll actually be able to combine that information in an optimal way are essentially nil. Our brains are simply not designed to work that way; you can’t internally compute the value you’d get out of a car using an equation like 1.03*cupholders + 0.021*horsepower + 0.3*mileage. Some of us try to do it that way–e.g., by making very long pro and con lists detailing all the relevant factors we can possibly think of–but it tends not to work out very well (e.g., you total up the numbers and realize, hey, that’s not the answer I wanted! And then you go buy that antique ’68 Cadillac you had your eye on the whole time you were pretending to count cupholders in the Nissan Maxima).
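
To make that point a bit more concrete, here's a minimal sketch (in Python) of what a mechanical prediction actually is: a fixed equation applied the same way to every case. Everything below is hypothetical–the weights, the predictors, and the candidates are invented purely for illustration, not drawn from any real admissions (or car-buying) model:

# Hypothetical sketch of a "mechanical" (actuarial) prediction:
# a fixed linear equation combines several predictors with explicit weights.
# All weights and numbers below are made up for illustration only.

def mechanical_score(gre, gpa, publications, interview):
    # In a real setting, these weights would be estimated by regressing an
    # outcome (e.g., later research productivity) on historical predictor data.
    return 0.45 * gre + 0.30 * gpa + 0.20 * publications + 0.05 * interview

# Predictors are assumed to be standardized (z-scores), so each weight
# expresses how much that factor contributes to the overall prediction.
candidates = {
    "Candidate X": {"gre": 1.8, "gpa": 1.2, "publications": 2.0, "interview": -0.5},
    "Candidate Y": {"gre": 0.6, "gpa": 0.9, "publications": 0.0, "interview": 1.5},
}

for name, scores in candidates.items():
    print(name, round(mechanical_score(**scores), 2))

The particular numbers don't matter; what matters is that the equation applies the same small interview weight to every candidate, every time–which is exactly the kind of discipline that, per the clinical-versus-mechanical prediction literature, human judges struggle to maintain.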

Admissions committees face much the same problem. The trouble lies not so much in determining which factors predict graduate school success (or, for that matter, many other outcomes we care about in daily life), but in determining how to best combine them. Knowing that interview performance incrementally improves predictions is only useful if you can actually trust decision-makers to weight that variable very lightly relative to other more meaningful predictors like GREs and GPAs. And that’s a difficult proposition, because I suspect that admissions discussions rarely go like this:

Faculty Member 1: I think we should accept Candidate X. Her GREs are off the chart, great GPA, already has two publications.
Faculty Member 2: I didn’t like X at all. She didn’t seem very excited to be here.
FM1: Well, that doesn’t matter so much. Unless you really got a strong feeling that she wouldn’t stick it out in the program, it probably won’t make much of a difference, performance-wise.
FM2: Okay, fine, we’ll accept her.

And more often go like this:

FM1: Let’s take Candidate X. Her GREs are off the chart, great GPA, already has two publications.
FM2: I didn’t like X at all. She didn’t seem very excited to be here.
FM1: Oh, you thought so too? That’s kind of how I felt too, but I didn’t want to say anything.
FM2: Okay, we won’t accept X. We have plenty of other good candidates with numbers that are nearly as good and who seemed more pleasant.

Admittedly, I don’t have any direct evidence to back up this conjecture. Except that I think it would be pretty remarkable if academic faculty departed from experts in pretty much every other domain that’s been tested (clinical practice, medical diagnosis, criminal recidivism, etc.) and were actually able to do as well (or even close to as well) as a simple regression equation. For what it’s worth, in many of the studies of mechanical prediction, the human experts are explicitly given all of the information passed to the prediction equation, and still do relatively poorly. In other words, you can hand a clinical psychologist a folder full of quantitative information about a patient, tell them to weight it however they want, and even the best clinicians are still going to be outperformed by a mechanical prediction (if you doubt this to be true, I second Sanjay in directing you to Paul Meehl’s seminal body of work–truly some of the most important and elegant work ever done in psychology, and if you haven’t read it, you’re missing out). And in some sense, faculty members aren’t really even experts about admissions, since they only do it once a year. So I’m pretty skeptical that admissions committees actually manage to weight their firsthand personal experience with candidates appropriately when making their final decisions. It seems much more likely that any personality impressions they come away with will just tend to drown out prior assessments based on (relatively) objective data.

That all said, I couldn’t agree more with Sanjay’s ultimate conclusion, so I’ll just end with this quote:

That, of course, is a testable question. So if you are an evidence-based curmudgeon, you should probably want some relevant data. I was not able to find any studies that specifically addressed the importance of rapport and interest-matching as predictors of later performance in a doctoral program. (Indeed, validity studies of graduate admissions are few and far between, and the ones I could find were mostly for medical school and MBA programs, which are very different from research-oriented Ph.D. programs.) It would be worth doing such studies, but not easy.

Oh, except that I do want to add that I really like the phrase “evidence-based curmudgeon”, and I'm totally stealing it.

will trade two Methods sections for twenty-two subjects worth of data

The excellent and ever-candid Candid Engineer in Academia has an interesting post discussing the love-hate relationship many scientists who work in wet labs have with benchwork. She compares two very different perspectives:

She [a current student] then went on to say that, despite wanting to go to grad school, she is pretty sure she doesn’t want to continue in academia beyond the Ph.D. because she just loves doing the science so much and she can’t imagine ever not being at the bench.

Being young and into the benchwork, I remember once asking my grad advisor if he missed doing experiments. His response: “Hell no.” I didn’t understand it at the time, but now I do. So I wonder if my student will always feel the way she does now- possessing of that unbridled passion for the pipet, that unquenchable thirst for the cell culture hood.

Wet labs are pretty much nonexistent in psychology–I've never had to put on gloves or goggles to do anything that I'd consider an “experiment”, and I've certainly never run the risk of spilling dangerous chemicals all over myself–so I have no opinion at all about benchwork. Maybe I'd love it, maybe I'd hate it; I couldn't tell you. But Candid Engineer's post did get me thinking about opinions surrounding the psychological equivalent of benchwork–namely, collecting data from human subjects. My sense is that there's somewhat more consensus among psychologists, in that most of us don't seem to like data collection very much. But there are plenty of exceptions, and there certainly are strong feelings on both sides.

More generally, I'm perpetually amazed at the wide range of opinions people can hold about the various elements of scientific research, even when the people doing the different-opinion-holding all work in very similar domains. For instance, my favorite aspect of the research I do, hands down, is data analysis. I'd be ecstatic if I could analyze data all day and never have to worry about actually communicating the results to anyone (though I enjoy doing that too). After that, there are activities like writing and software development, which I spend a lot of time doing, and occasionally enjoy, but also frequently find very frustrating. And then, at the other end, there are aspects of research–nasty, evil activities like writing IRB proposals and, yes, collecting data–that I find have little redeeming value beyond their instrumental role in supporting other, more pleasant, activities.

To me, collecting data is something you do because you’re fundamentally interested in some deep (or maybe not so deep) question about how the mind works, and the only way to get an answer is to actually interrogate people while they do stuff in a controlled environment. It isn’t something I do for fun. Yet I know people who genuinely seem to love collecting data–or, for that matter, writing Methods sections or designing new experiments–even as they loathe perfectly pleasant activities like, say, sitting down to analyze the data they’ve collected, or writing a few lines of code that could save them hours’ worth of manual data entry. On a personal level, I find this almost incomprehensible: how could anyone possibly enjoy collecting data more than actually crunching the numbers and learning new things? But I know these people exist, because I’ve talked to them. And I recognize that, from their perspective, I’m the guy with the strange views. They’re sitting there thinking: what kind of joker actually likes to turn his data inside out several dozen times? What’s wrong with just running a simple t-test and writing up the results as fast as possible, so you can get back to the pleasure of designing and running new experiments?

This of course leads us directly to the care bears fucking tea party moment where I tell you how wonderful it is that we all have these different likes and dislikes. I’m not being sarcastic; it really is great. Ultimately, it works to everyone’s advantage that we enjoy different things, because it means we get to collaborate on projects and take advantage of complementary strengths and interests, instead of all having to fight over who gets to write the same part of the Methods section. It’s good that there are some people who love benchwork and some people who hate it, and it’s good that there are people who’re happy to write software that other people who hate writing software can use. We don’t all have to pretend we understand each other; it’s enough just to nod and smile and say “but of course you can write the Methods for that paper; I really don’t mind. And yes, I guess I can run some additional analyses for you, really, it’s not too much trouble at all.”

academic bloggers on blogging

Is it wise for academics to blog? Depends on who you ask. Scott Sumner summarizes his first year of blogging this way:

Be careful what you wish for.  Last February 2nd I started this blog with very low expectations.  During the first three weeks most of the comments were from Aaron Jackson and Bill Woolsey.  I knew I wasn’t a good writer, years ago I got a referee report back from an anonymous referee (named McCloskey) who said “if the author had used no commas at all, his use of commas would have been more nearly correct.”  Ouch!  But it was true, others said similar things.  And I was also pretty sure that the content was not of much interest to anyone.

Now my biggest problem is time—I spend 6 to 10 hours a day on the blog, seven days a week.  Several hours are spent responding to reader comments and the rest is spent writing long-winded posts and checking other economics blogs.  And I still miss many blogs that I feel I should be reading. …

Regrets?  I’m pretty fatalistic about things.  I suppose it wasn’t a smart career move to spend so much time on the blog.  If I had ignored my commenters I could have had my manuscript revised by now. …  And I really don’t get any support from Bentley, as far as I know the higher ups don’t even know I have a blog. So I just did 2500 hours of uncompensated labor.

I don't think Sumner actually regrets blogging (as the rest of his excellent post makes clear), but he does seem to think it's hurt him professionally in some ways–most notably, because all the time he spends blogging is time he could be spending on something else (like revising that manuscript).

Andrew Gelman has a very different take:

I agree with Sethi that Sumner’s post is interesting and captures much of the blogging experience. But I don’t agree with that last bit about it being a bad career move. Or perhaps Sumner was kidding? (It’s notoriously difficult to convey intonation in typed speech.) What exactly is the marginal value of his having a manuscript revised? It’s not like Bentley would be compensating him for that either, right? For someone like Sumner (or, for that matter, Alex Tabarrok or Tyler Cowen or my Columbia colleague Peter Woit), blogging would seem to be an excellent career move, both by giving them and their ideas much wider exposure than they otherwise would’ve had, and also (as Sumner himself notes) by being a convenient way to generate many thousands of words that can be later reworked into a book. This is particularly true of Sumner (more than Tabarrok or Cowen or, for that matter, me) because he tends to write long posts on common themes. (Rajiv Sethi, too, might be able to put together a book or some coherent articles by tying together his recent blog entries.)

Blogging and careers, blogging and careers . . . is blogging ever really bad for an academic career? I don’t know. I imagine that some academics spend lots of time on blogs that nobody reads, and that could definitely be bad for their careers in an opportunity-cost sort of way. Others such as Steven Levitt or Dan Ariely blog in an often-interesting but sometimes careless sort of way. This might be bad for their careers, but quite possibly they’ve reached a level of fame in which this sort of thing can’t really hurt them anymore. And this is fine; such researchers can make useful contributions with their speculations and let the Gelmans and Fungs of the world clean up after them. We each have our role in this food web. … And then of course there are the many many bloggers, academic and otherwise, whose work I assume I would’ve encountered much more rarely were they not blogging.

My own experience falls much more in line with Gelman’s here; my blogging experience has been almost wholly positive. Some of the benefits I’ve found to blogging regularly:

  • I’ve had many interesting email exchanges with people that started via a comment on something I wrote, and some of these will likely turn into collaborations at some point in the future.
  • I’ve been exposed to lots of interesting things (journal articles, blog posts, datasets, you name it) I wouldn’t have come across otherwise–either via links left in comments or sent by email, or while rooting around the web for things to write about.
  • I've gotten to publicize and promote my own research, which is always nice. As Gelman points out, it's easier to learn about other people's work if those people are actively blogging about it. I think that's particularly true for people who are just starting out in their careers.
  • I think blogging has improved both my ability and my willingness to write. By nature, I don’t actually like writing very much, and (like most academics I know) I find writing journal articles particularly unpleasant. Forcing myself to blog (semi-)regularly has instilled a certain discipline about writing that I haven’t always had, and if nothing else, it’s good practice.
  • I get to share ideas and findings I find interesting and/or important with other people. This is already what most academics do over drinks at conferences (and I think it’s a really important part of science), and blogging seems like a pretty natural extension.

All this isn't to say that there aren't any potential drawbacks to blogging. I think there are at least two important ones. One is the obvious point that, unless you're blogging anonymously, it's probably unwise to say things online that you wouldn't feel comfortable saying in person. So, despite being a class-A jackass (fine: pretty critical by nature), I try to discuss things I like as often as things I don't like–and to keep the tone constructive whenever I do the latter.

The other potential drawback, which both Sumner and Gelman allude to, is the opportunity cost. If you’re spending half of your daylight hours blogging, there’s no question it’s going to have an impact on your academic productivity. But in practice, I don’t think blogging too much is a problem many academic bloggers have. I usually find myself wishing most of the bloggers I read posted more often. In my own case, I almost exclusively blog after around 9 or 10 pm, when I’m no longer capable of doing sustained work on manuscripts anyway (I’m generally at my peak in the late morning and early afternoon). So, for me, blogging has replaced about ten hours a week of book reading/TV watching/web surfing, while leaving the amount of “real” work I do largely unchanged. That’s not really much of a cost, and I might even classify it as another benefit. With the admittedly important caveat that watching less television has made me undeniably useless at trivia night.