whether or not you should pursue a career in science still depends mostly on that thing that is you

I took the plunge a couple of days ago and answered my first question on Quora. Since Brad Voytek won’t shut up about how great Quora is, I figured I should give it a whirl. So far, Brad is not wrong.

The question in question is: “How much do you agree with Johnathan Katz’s advice on (not) choosing science as a career? Or how realistic is it today (the article was written in 1999)?” The Katz piece referred to is here. The gist of it should be familiar to many academics; the argument boils down to the observation that relatively few people who start graduate programs in science actually end up with permanent research positions, and even then, the need to obtain funding often crowds out the time one has to do actual science. Katz’s advice is basically: don’t pursue a career in science. It’s not an optimistic piece.

My answer is, I think, somewhat more optimistic. Here’s the full text:

The real question is what you think it means to be a scientist. Science differs from many other professions in that the typical process of training as a scientist–i.e., getting a Ph.D. in a scientific field from a major research university–doesn’t guarantee you a position among the ranks of the people who are training you. In fact, it doesn’t come close to guaranteeing it; the proportion of PhD graduates in science who go on to obtain tenure-track positions at research-intensive universities is very small–around 10% in most recent estimates. So there is a very real sense in which modern academic science is a bit of a pyramid scheme: there are a relatively small number of people at the top, and a lot of people on the rungs below laboring to get up to the top–most of whom will, by definition, fail to get there.

If you equate a career in science solely with a tenure-track position at a major research university, and are considering the prospect of a Ph.D. in science solely as an investment intended to secure that kind of position, then Katz’s conclusion is difficult to escape. He is, in most respects, correct: in most biomedical, social, and natural science fields, science is now an extremely competitive enterprise. Not everyone makes it through the PhD; of those who do, not everyone makes it into–and then through–one or more postdocs; and of those who do that, relatively few secure tenure-track positions. Then, of those few “lucky” ones, some will fail to get tenure, and many others will find themselves spending much or most of their time writing grants and managing people instead of actually doing science. So from that perspective, Katz is probably right: if what you mean when you say you want to become a scientist is that you want to run your own lab at a major research university, then your odds of achieving that at the outset are probably not very good (though, to be clear, they’re still undoubtedly better than your odds of becoming a successful artist, musician, or professional athlete). Unless you have really, really good reasons to think that you’re particularly brilliant, hard-working, and creative (note: undergraduate grades, casual feedback from family and friends, and your own internal gut sense do not qualify as really, really good reasons), you probably should not pursue a career in science.

But that’s only true given a rather narrow conception where your pursuit of a scientific career is motivated entirely by the end goal rather than by the process, and where failure is anything other than ending up with a permanent tenure-track position. By contrast, if what you’re really after is an environment in which you can pursue interesting questions in a rigorous way, surrounded by brilliant minds who share your interests, and with more freedom than you might find at a typical 9 to 5 job, the dream of being a scientist is certainly still alive, and is worth pursuing. The trivial demonstration of this is that if you’re one of the many people who actually enjoy the graduate school environment (yes, they do exist!), it may not even matter to you that much whether or not you have a good shot at getting a tenure-track position when you graduate.

To see this, imagine that you’ve just graduated with an undergraduate degree in science, and someone offers you a choice between two positions for the next six years. One position is (relatively) financially secure, but involves rather boring work of questionable utility to society, an inflexible schedule, and colleagues who are mostly only there for a paycheck. The other position has terrible pay, but offers fascinating and potentially important work, a flexible lifestyle, and colleagues who are there because they share your interests and want to do scientific research.

Admittedly, real-world choices are rarely this stark. Many non-academic jobs offer many of the same perceived benefits of academia (e.g., many tech jobs offer excellent working conditions, flexible schedules, and important work). Conversely, many academic environments don’t quite live up to the ideal of a place where you can go to pursue your intellectual passion unfettered by the annoyances of “real” jobs–there’s often just as much in the way of political intrigue, personality dysfunction, and menial dues-paying duties. But to a first approximation, this is basically the choice you have when considering whether to go to graduate school in science or pursue some other career: you’re trading financial security and a fixed 40-hour work week against intellectual engagement and a flexible lifestyle. And the point to note is that, even if we completely ignore what happens after the six years of grad school are up, there is clearly a non-negligible segment of the population who would quite happily opt for the second choice–even recognizing full well that at the end of six years they may have to leave and move on to something else, with little to show for their effort. (Of course, in reality we don’t need to ignore what happens after six years, because many PhDs who don’t get tenure-track positions find rewarding careers in other fields–many of them scientific in nature. And, even though it may not be a great economic investment, having a Ph.D. in science is a great thing to be able to put on one’s resume when applying for a very broad range of non-academic positions.)

The bottom line is that whether or not you should pursue a career in science has as much or more to do with your goals and personality as it does with the current environment within or outside of (academic) science. In an ideal world (which is certainly what the 1970s as described by Katz sound like, though I wasn’t around then), it wouldn’t matter: if you had any inkling that you wanted to do science for a living, you would simply go to grad school in science, and everything would probably work itself out. But given real-world constraints, it’s absolutely essential that you think very carefully about what kind of environment makes you happy and what your expectations and goals for the future are. You have to ask yourself: Am I the kind of person who values intellectual freedom more than financial security? Do I really love the process of actually doing science–not some idealized movie version of it, but the actual messy process–enough to warrant investing a huge amount of my time and energy over the next few years? Can I deal with perpetual uncertainty about my future? And ultimately, would I be okay doing something that I really enjoy for six years if at the end of that time I have to walk away and do something very different?

If the answer to all of these questions is yes–and for many people it is!–then pursuing a career in science is still a very good thing to do (and hey, you can always quit early if you don’t like it–then you’ve lost very little time!). If the answer to any of them is no, then Katz may be right. A career in science may or may not be for you, but at the very least, you should carefully consider alternative prospects. There’s absolutely no shame in going either route; the important thing is just to make an honest decision that takes the facts as they are, not as you wish they were.

A couple of other thoughts I’ll add belatedly:

  • Calling academia a pyramid scheme is admittedly a bit hyperbolic. It’s true that the personnel structure in academia broadly has the shape of a pyramid, but that’s true of most organizations in most other domains too. Pyramid schemes are typically built on promises and lies that (almost by definition) can’t be realized, and I don’t think many people who enter a Ph.D. program in science can claim with a straight face that they were guaranteed a permanent research position at the end of the road (or that it’s impossible to get such a position). As I suggested in this post, it’s much more likely that everyone involved is simply guilty of minor (self-)deception: faculty don’t go out of their way to tell prospective students what the odds are of actually getting a tenure-track position, and prospective grad students don’t work very hard to find out the painful truth, or to tell faculty what their real intentions are after they graduate. And it may actually be better for everyone that way.
  • Just in case it’s not clear from the above, I’m not in any way condoning the historically low levels of science funding, or the fact that very few science PhDs go on to careers in academic research. I would love for NIH and NSF budgets (or whatever your local agency is) to grow substantially–and for everyone to get exactly the kind of job they want, academic or not. But that’s not the world we live in, so we may as well be pragmatic about it and try to identify the conditions under which it does or doesn’t make sense to pursue a career in science right now.
  • I briefly mention this above, but it’s probably worth stressing that there are many jobs outside of academia that still allow one to do scientific research, albeit typically with less freedom (but often for better hours and pay). In particular, the market for data scientists is booming right now, and many of the hires are coming directly from academia. One lesson to take away from this is: if you’re in a science Ph.D. program right now, you should really spend as much time as you can building up your quantitative and technical skills, because they could very well be the difference between a job that involves scientific research and one that doesn’t in the event you leave academia. And those skills will still serve you well in your research career even if you end up staying in academia.

 

The homogenization of scientific computing, or why Python is steadily eating other languages’ lunch

Over the past two years, my scientific computing toolbox has been steadily homogenizing. Around 2010 or 2011, my toolbox looked something like this:

  • Ruby for text processing and miscellaneous scripting;
  • Ruby on Rails/JavaScript for web development;
  • Python/NumPy (mostly) and MATLAB (occasionally) for numerical computing;
  • MATLAB for neuroimaging data analysis;
  • R for statistical analysis;
  • R for plotting and visualization;
  • Occasional excursions into other languages/environments for other stuff.

In 2013, my toolbox looks like this:

  • Python for text processing and miscellaneous scripting;
  • Ruby on Rails/JavaScript for web development, except for an occasional date with Django or Flask (Python frameworks);
  • Python (NumPy/SciPy) for numerical computing;
  • Python (Neurosynth, NiPy, etc.) for neuroimaging data analysis;
  • Python (NumPy/SciPy/pandas/statsmodels) for statistical analysis;
  • Python (matplotlib) for plotting and visualization, except for web-based visualizations (JavaScript/d3.js);
  • Python (scikit-learn) for machine learning;
  • Excursions into other languages have dropped markedly.

You may notice a theme here.

The increasing homogenization (Pythonification?) of the tools I use on a regular basis primarily reflects the spectacular recent growth of the Python ecosystem. A few years ago, you couldn’t really do statistics in Python unless you wanted to spend most of your time pulling your hair out and wishing Python were more like R (which is a pretty remarkable confession, considering what R is like). Neuroimaging data could be analyzed in SPM (MATLAB-based), FSL, or a variety of other packages, but there was no viable full-featured, free, open-source Python alternative. Packages for machine learning, natural language processing, and web application development were only just starting to emerge.

These days, tools for almost every aspect of scientific computing are readily available in Python. And in a growing number of cases, they’re eating the competition’s lunch.

Take R, for example. R’s out-of-the-box performance with out-of-memory datasets has long been recognized as its Achilles heel (yes, I’m aware you can get around that if you’re willing to invest the time–but not many scientists have the time). But even people who hated the way R chokes on large datasets, and its general clunkiness as a language, often couldn’t help running back to R as soon as any kind of serious data manipulation was required. You could always laboriously write code in Python or some other high-level language to pivot, aggregate, reshape, and otherwise pulverize your data, but why would you want to? The beauty of packages like plyr in R was that you could, in a matter of 2 – 3 lines of code, perform enormously powerful operations that could take hours to duplicate in other languages. The downside was the steep learning curve associated with each package’s often quite complicated API (e.g., ggplot2 is incredibly expressive, but every time I stop using ggplot2 for 3 months, I have to completely re-learn it), and having to contend with R’s general awkwardness. But still, on the whole, it was clearly worth it.

Flash forward to The Now. Last week, someone asked me for some simulation code I’d written in R a couple of years ago. As I was firing up RStudio to dig around for it, I realized that I hadn’t actually fired up RStudio for a very long time prior to that moment–probably not in about 6 months. The combination of NumPy/SciPy, matplotlib, pandas, and statsmodels had effectively replaced R for me, and I hadn’t even noticed. At some point I just stopped dropping out of Python and into R whenever I had to do the “real” data analysis. Instead, I just started importing pandas and statsmodels into my code. The same goes for machine learning (scikit-learn), natural language processing (nltk), document parsing (BeautifulSoup), and many other things I used to do outside Python.
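To give a concrete flavor of what that switch looks like in code, here’s a minimal sketch–the data and variable names are invented for illustration, but the pandas and statsmodels calls are real:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Invented data: reaction times for 20 subjects in two conditions.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "subject": np.repeat(np.arange(20), 2),
    "condition": ["easy", "hard"] * 20,
    "rt": rng.normal(500, 50, 40) + np.tile([0, 30], 20),
})

# plyr-style split-apply-combine in a single line of pandas...
print(df.groupby("condition")["rt"].agg(["mean", "std"]))

# ...and an R-style formula interface for the model itself.
model = smf.ols("rt ~ condition", data=df).fit()
print(model.summary())
```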

It turns out that the benefits of doing all of your development and analysis in one language are quite substantial. For one thing, when you can do everything in the same language, you don’t have to suffer the constant cognitive switch costs of reminding yourself, say, that Ruby uses blocks instead of comprehensions, or that you need to call len(array) instead of array.length to get the size of an array in Python; you can just keep solving the problem you’re trying to solve with as little cognitive overhead as possible. Also, you no longer need to worry about interfacing between different languages used for different parts of a project. Nothing is more annoying than parsing some text data in Python, finally getting it into the format you want internally, and then realizing you have to write it out to disk in a different format so that you can hand it off to R or MATLAB for some other set of analyses*. In isolation, this kind of thing is not a big deal. It doesn’t take very long to write out a CSV or JSON file from Python and then read it into R. But it does add up. It makes integrated development more complicated, because you end up with more code scattered around your drive in more locations (well, at least if you have my organizational skills). It means you spend a non-negligible portion of your “analysis” time writing trivial little wrappers for all that interface stuff, instead of thinking deeply about how to actually transform and manipulate your data. And it means that your beautiful analytics code is marred by all sorts of ugly open() and read() I/O calls. All of this overhead vanishes as soon as you move to a single language.
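Here’s a trivial illustration of the difference–the file name and records are hypothetical, but the point is how the hand-off step simply disappears:

```python
import pandas as pd

# Pretend these records were painstakingly parsed out of raw text upstream.
records = [{"subject": 1, "rt": 532}, {"subject": 2, "rt": 487}]

# The old two-language workflow: serialize to disk so R can pick it up...
pd.DataFrame(records).to_csv("parsed_for_r.csv", index=False)
# ...and then, in a separate R script: df <- read.csv("parsed_for_r.csv")

# The single-language workflow: just keep going right where you are.
df = pd.DataFrame(records)
print(df.groupby("subject")["rt"].mean())
```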

Convenience aside, another thing that’s impressive about the Python scientific computing ecosystem is that a surprising number of Python-based tools are now best-in-class (or close to it) in terms of scope and ease of use–and, in virtue of C bindings, often even in terms of performance. It’s hard to imagine an easier-to-use machine learning package than scikit-learn, even before you factor in the breadth of implemented algorithms, excellent documentation, and outstanding performance. Similarly, I haven’t missed any of the data manipulation functionality in R since I switched to pandas. Actually, I’ve discovered many new tricks in pandas I didn’t know in R (some of which I’ll describe in an upcoming post). Considering that pandas considerably outperforms R for many common operations, the reasons for me to switch back to R or other tools–even occasionally–have dwindled.
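If you’ve never touched scikit-learn, here’s a minimal sketch of what I mean by ease of use (synthetic data; the particular model is an arbitrary choice, and the module layout assumes a recent version of the library):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Generate a synthetic classification problem.
X, y = make_classification(n_samples=500, n_features=20, random_state=0)

# Fit and evaluate a classifier with 5-fold cross-validation.
clf = RandomForestClassifier(n_estimators=100, random_state=0)
print(cross_val_score(clf, X, y, cv=5).mean())  # mean accuracy across folds
```

Swapping in a different algorithm is typically a one-line change, because every estimator exposes the same fit/predict interface.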

Mind you, I don’t mean to imply that Python can now do everything anyone could ever do in other languages. That’s obviously not true. For instance, there are currently no viable replacements for many of the thousands of statistical packages users have contributed to R (if there’s a good analog for lme4 in Python, I’d love to know about it). In signal processing, I gather that many people are wedded to various MATLAB toolboxes and packages that don’t have good analogs within the Python ecosystem. And for people who need serious performance and work with very, very large datasets, there’s often still no substitute for writing highly optimized code in a low-level compiled language. So, clearly, what I’m saying here won’t apply to everyone. But I suspect it applies to the majority of scientists.

Speaking only for myself, I’ve now arrived at the point where around 90 – 95% of what I do can be done comfortably in Python. So the major consideration for me, when determining what language to use for a new project, has shifted from “what’s the best tool for the job that I’m willing to learn and/or tolerate using?” to “is there really no way to do this in Python?” By and large, this mentality is a good thing, though I won’t deny that it occasionally has its downsides. For example, back when I did most of my data analysis in R, I would frequently play around with random statistics packages just to see what they did. I don’t do that much any more, because the pain of having to refresh my R knowledge and deal with that thing again usually outweighs the perceived benefits of aimless statistical exploration. Conversely, sometimes I end up using Python packages that I don’t like quite as much as comparable packages in other languages, simply for the sake of preserving language purity. For example, I prefer Rails’ ActiveRecord ORM to the much more explicit SQLAlchemy ORM for Python–but I don’t prefer it enough to justify mixing Ruby and Python objects in the same application. So, clearly, there are costs. But they’re pretty small costs, and for me personally, the scales have now clearly tipped in favor of using Python for almost everything. I know many other researchers who’ve had the same experience, and I don’t think it’s entirely unfair to suggest that, at this point, Python has become the de facto language of scientific computing in many domains. If you’re reading this and haven’t had much prior exposure to Python, now’s a great time to come on board!
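For the curious, here’s roughly what that explicitness looks like–a toy model, assuming SQLAlchemy 1.4 or later; in ActiveRecord, by contrast, the columns would be inferred from the database schema rather than declared in the class:

```python
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()

class Study(Base):
    __tablename__ = "studies"
    id = Column(Integer, primary_key=True)  # every column spelled out by hand
    title = Column(String)

engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)  # create the table

with Session(engine) as session:
    session.add(Study(title="pilot"))
    session.commit()
```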

Postscript: In the period of time between starting this post and finishing it (two sessions spread about two weeks apart), I discovered not one but two new Python-based packages for data visualization: Michael Waskom’s seaborn package–which provides very high-level wrappers for complex plots, with a beautiful ggplot2-like aesthetic–and Continuum Analytics’ bokeh, which looks like a potential game-changer for web-based visualization**. At the rate the Python ecosystem is moving, there’s a non-zero chance that by the time you read this, I’ll be using some new Python package that directly transliterates my thoughts into analytics code.
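To give just a taste of seaborn’s high-level interface, here’s a scatter plot with a fitted regression line in one call–randomly generated data, and assuming a reasonably recent version of the package:

```python
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Make some noisy linear data to plot.
rng = np.random.default_rng(0)
df = pd.DataFrame({"x": rng.normal(size=100)})
df["y"] = 2 * df["x"] + rng.normal(size=100)

sns.lmplot(data=df, x="x", y="y")  # scatter plus regression fit in one call
plt.show()
```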

 

* I’m aware that there are various interfaces between Python, R, etc. that allow you to internally pass objects between these languages. My experience with these has not been overwhelmingly positive, and in any case they still introduce all the overhead of writing extra lines of code and having to deal with multiple languages.

** Yes, you heard right: web-based visualization in Python. Bokeh generates static JavaScript and JSON for you from Python code, so your users are magically able to interact with your plots on a webpage without you having to write a single line of native JS code.

I’m moving to Austin!

The title pretty much says it. After spending four great years in Colorado, I’m happy to say that I’ll be moving to Austin at the end of the month. I’ll be joining the Department of Psychology at UT-Austin as a Research Associate, where I plan to continue dabbling in all things psychological and informatic, but with less snow and more air conditioning.

While my new position nominally has the same title as my old one, the new one’s a bit unusual in that the funding is coming from two quite different sources. Half of it comes from my existing NIH grant for development of the Neurosynth framework, which means that half of my time will be spent more or less the same way I’m spending it now–namely, on building tools to improve and automate the large-scale synthesis of functional MRI data. (Incidentally, I’ll be hiring a software developer and/or postdoc in the very near future, so drop me a line if you think you might be interested.)

The other half of the funding is tied to the PsyHorns course developed by Jamie Pennebaker and Sam Gosling over the past few years. PsyHorns is a synchronous massive online course (SMOC) that lets anyone in the world with an internet connection (okay, and $550 in loose change lying around) take an introductory psychology class via the internet and officially receive credit for it from the University of Texas (this recent WSJ article on PsyHorns provides some more details). My role will be to serve as a bridge between the psychologists and the developers–which means I’ll have an eclectic assortment of duties like writing algorithms to detect cheating, developing tools to predict how well people are doing in the class, mining the gigantic reams of data we’re acquiring, developing ideas for new course features, and, of course, publishing papers.

Naturally, the PILab will be joining me in my southern adventure. Since the PILab currently only has one permanent member (guess who?), and otherwise consists of a single Mac Pro workstation, this latter move involves much less effort than you might think (though it does mean I’ll have to change the lab website’s URL, logo, and–horror of horrors–color scheme). Unfortunately, all the wonderful people of the PILab will be staying behind, as they all have various much more important ties to Boulder (by which I mean that I’m not actually currently paying any of their salaries, and none of them were willing to subsist on the stipend of baked beans, love, and high-speed internet I offered them).

While I’m super excited about moving to Austin, I’m not at all excited to leave Colorado. Boulder is a wonderful place to live*–it’s sunny all the time, has a compact, walkable core, a surprising amount of stuff to do, and these gigantic mountain things you can walk all over. My wife and I have made many incredible friends here, and after four years in Colorado, it’s come to feel very much like home. So leaving will be difficult. Still, I’m excited to move on to new things. As great as the past four years have been, a number of factors precipitated this move:

  • The research fit is better. This isn’t in any way a knock against the environment here at Colorado, which has been great (hey, they’re hiring! If you do computational cognitive neuroscience, you should apply!). I had great colleagues here who work on some really interesting questions–particularly Tor Wager, my postdoc advisor for my first 3 years here, who’s an exceptional scientist and stellar human being. But every department necessarily has to focus on some areas at the expense of others, and much of the research I do (or would ideally like to do) wasn’t well-represented here. In particular, my interests in personality and individual differences have languished during my time in Boulder, as I’ve had trouble finding collaborators for most of the project ideas I’ve had. UT-Austin, by contrast, has one of the premier personality and individual differences groups anywhere. I’m delighted to be working a few doors down from people like Sam Gosling, Jamie Pennebaker, Elliot Tucker-Drob, and David Buss. On top of that, UT-Austin still has major strengths in most of my other areas of interest, most notably neuroimaging (I expect to continue to collaborate frequently with Russ Poldrack) and data mining (a world-class CS department with an expanding focus on Big Data). So, purely in terms of fit, it’s hard for me to imagine a better place than UT.
  • I’m excited to work on a project with immediate real-world impact. While I’d love to believe that most of the work I currently do is making the world better in some very small way, the reality most scientists engaged in basic research face is that at the end of the day, we don’t actually know what impact we’re having. There’s nothing inherently wrong with that, mind you; as a general rule, I’m a big believer in the idea of doing science just because it’s interesting and exciting, without worrying about the consequences (or lack thereof). You know, knowledge for its own sake and all that. Still, on a personal level, I find myself increasingly wanting to do something that I feel confers some clear and measurable benefit on the world right now–however small. In that respect, online education strikes me as an excellent area to pour my energy into. And PsyHorns is a particularly unusual (and, to my mind, promising) experiment in online education. The preliminary data from previous iterations of the course suggests that students who take the course synchronously online do better academically–not just in this particular class (as compared to an in-class section), but in other courses as well. While I’m not hugely optimistic about the malleability of the human mind as a general rule–meaning, I don’t think there are as-yet undiscovered teaching approaches that are going to radically improve the learning experience–I do believe strongly in the cumulative impact of many small nudges in the right direction. I think this is the right platform for that kind of nudging.
  • Data. Lots and lots of data. Enrollment in PsyHorns this year is about 1,500 students, and previous iterations have seen comparable numbers. As part of their introduction to psychology, the students engage in a wide range of activities: they have group chats about the material they’re learning; they write essays about a range of topics; they fill out questionnaires and attitude surveys; and, for the first time this year, they use a mobile app that assesses various aspects of their daily experience. Aside from the feedback we provide to the students (some of which is potentially actionable right away), the data we’re collecting provides a unique opportunity to address many questions at the intersection of personality and individual differences, health and subjective well-being, and education. It’s not Big Data by, say, Google or Amazon standards (we’re talking thousands of rows rather than billions), but it’s a dataset with few parallels in psychology, and I’m thrilled to be able to work on it.
  • I like doing research more than I like teaching** or doing service work. Like my current position, the position I’m assuming at UT-Austin is 100% research-focused, with very little administrative or teaching overhead. Obviously, it doesn’t have the long-term security of a tenure-track position, but I’m okay with that. I’m still selectively applying for tenure-track positions (and turned one down this year in favor of the UT position), so it’s not as though I have any principled objections to the tenure stream. But short of a really amazing opportunity, I’m very happy with my current arrangement.
  • Austin seems like a pretty awesome place to live. Boulder is too, but after four years of living in a relatively small place (population: ~100,000), my wife and I are looking forward to living somewhere more city-like. We’ve opted to take the (expensive) plunge and live downtown–where we’ll be within walking distance of just about everything we need. By which of course I mean the chocolate fountain at the Whole Foods mothership.
  • The tech community in Austin is booming. Given that most of my work these days lies at the interface of psychology and informatics, and there are unprecedented opportunities for psychology-related data mining in industry, I’m hoping to develop better collaborations with people in industry–at both startups and established companies. While I have no intention of leaving academia in the near future, I do think psychologists have collectively failed to take advantage of the many opportunities to collaborate with folks in industry on interesting questions about human behavior–often at an unprecedented scale. I’ve done a terrible job of that myself, and fixing that is near the top of my agenda. So, hey, if you work at a tech company in Austin and have some data lying around that you think might shed new light on what people feel, think, and do, let’s chat!
  • I guess sometimes you just get the itch to move on to something new. For me, this is that.


 

 

* Okay, it was an amazing place to live until the massive floods this past week rearranged rivers, roads, and lives. My wife and I were fortunate enough to escape any personal or material damage, but many others were not so lucky. If you’d like to help, please consider making a donation.

** Actually, I love teaching. What I don’t love is all the stuff surrounding teaching.

Jirafas

This is fiction.

The party is supposed to start at 7 pm, but of course, no one shows up before 8:45. When the guests finally do arrive, I randomly assign each of them to one of four groups–A through D–as they enter. Each assignment comes with an adhesive 2″ color patch, a nametag, and a sharpie.

“The labels are not for the dinner,” I say, “they’re for the orgy that follows the dinner. The bedrooms are all color-coded; there are strict rules governing inter-cubicular transitions. Please read the manual on the table.”

Nobody moves to pick up the manual. There’s a long and uncomfortable silence, made longer and more uncomfortable by the fact that we can all hear the upstairs neighbors loudly having sex on their kitchen counter.

“Turn on the music,” my wife says. “It masks the sex.”

I put on some music. Something soft, by Elton John, followed by something angry—a duet by Tenacious D and Lynyrd Skynyrd. One of the guests—unsoothed by the music, and noticing the random collection of chairs scattered around the living room—grows restless and asks whether we will all be playing musical chairs this fine evening.

“No,” I reply; “this fine night, we all play Mafia.” Then I shoot him dead as everyone else pretends to stare out the window.

In the kitchen, my wife uncorks the last bottle of wine. As trendy wines go, this one wears its pretension with pride: Jugo de Jirafas, the label proclaims in vermilion Helvetica Neue overtones.

“What does jirafas mean,” I ask my Spanish friend. “Giraffes?”

“No,” she says. “Jirafas was a famous rebel general who came out of hiding during the Spanish Civil War to challenge Franco to a fight to the death. They brawled in the streets for hours, and just when it looked like Jirafas was about to snap Franco’s neck, Franco screamed for his deputies, who immediately pumped several rounds straight through Jirafas’s heart. They say the body continued to bleed courage into the street for several weeks.”

Jugo de Jirafas, I enunciate out loud.

There’s an awkward silence in the living room as the assembled guests all hold an involuntary thirty-second vigil for the dearly departed General Jirafas, who was taken from us much too soon. Poor man—we barely knew him.

Then the vigil is broken up by the arrival of my Brazilian friend João, who lives across the way. Our housing complex is nominally open to all faculty and staff affiliated with the university, but in practice it more or less operates as a kind of hippie commune for expatriate scientists. On any given day you can hear forty different languages being spoken, and stumble across marauding groups of eight-year old children all babbling away at each other in mutual incomprehension. Walking through our apartment complex is like taking a simultaneous trip through every foreign-language channel on extended cable.

It does have its perks, though. For example, if you want to experience other cultures, you don’t need to travel anywhere. When people suggest that I’ve been working too hard and need a vacation, I yell at João through the bedroom window: how’s Rio this time of year?

Exceptional, he’ll yell back. The cannonball trees are in full bloom. You should come for a visit.

Then I usually take a bottle of wine over—nothing of Jugo de Jirafas caliber, just a basic Zinfandel from Whole Foods—and we sit around and talk about the strange places we’ve lived: Rio and Istanbul for him; Mombasa and Ottawa for me. After dinner we usually play a few games of backgammon, which is not a Brazilian game at all, but is acceptable to play because João spent three years of his life doing a postdoc in Turkey. Thus begins and ends my cosmetic Latin American vacation, punctuated by a detour to the Near East.

Tonight, João shows up with a German lady on his arm. She’s a newly arrived faculty member in the Department of Earth Sciences.

“This is the bad Jew I was telling you about,” he says to the lady by way of introduction.

“It’s true,” I say; “I’m a very bad Jew. Even by Jewish standards.”

She wants to know what makes a Jew a bad Jew. I tell her I eat bacon on the Sabbath and wrap myself in cheeseburgers before bed. And that I make sure to drink the blood of goyim at least four times a year. And that I’m so money-hungry and cunning, I’ve been banned from lending money even to other Jews.

My joke doesn’t go over so well. Germans have had, for obvious reasons, a lot of trouble putting the war behind them. When you make Jew jokes in Germany, people give you a look that’s made up of one part contempt, one part cognitive dissonance. They don’t know what to do; it’s like you’ve lit a warehouse full of bottle rockets up inside their heads all at once. As an American, I don’t mind this, of course. In America, it’s your god-given birthright to make ethnic jokes at your own expense. As long as you’re making fun only of your own in-group and nobody else, no one is allowed to come between you and your chuckles.

The German lady doesn’t see it this way.

“You should not make fun of the Jews,” she says in over-articled English. “Even if you are a one yourself.”

“Well,” says I. “If you can’t laugh at yourself, who can you laugh at?”

She shrugs her shoulders.

“Other people,” offers João.

So I laugh at João, because he’s another person. There’s an uncomfortable pause, but then the earth scientist–whose name turns out to be Brunhilde–laughs too. A moment later, we’re all making small talk again, and I feel pretty confident that any budding crisis in diplomatic relations has been averted.

“Speaking of making fun of others,” João says, “what happened to your lip? It looks like you have the herpes.”

“I damaged myself while flossing,” I tell him.

It’s true: I have a persistent cut on my lip caused by aggressive flossing. It refuses to heal. And now, after several days of incubation, it looks exactly like a cold sore. So I have to walk around my life constantly putting up with herpes jokes.

“I’ll go put something on it,” I say, self-consciously rubbing at the wound. “You just stand here and keep laughing at me, you anti-semite.”

Turns out, I’ve forgotten the name of the lip balm my wife buys. So I walk around the party with a chafed, bloody lip, asking everyone I know if they’ve seen my Tampax. The guests mostly demur quietly, but one particularly mercurial friend looks slightly alarmed, and slowly starts to edge towards the door.

He means Carmex, my wife yells from the kitchen.

Eventually, all of the wine is drunk and the conversation is spent. The guests begin to leave, each one curling his or her self carefully through the doorway in sequence. For some reason, they remind me of ants circling around a drain—but I don’t tell anyone that. There is no longer any music; there was never an orgy. There are no more Jew jokes. I turn the phonograph off—by which I mean I press the stop button on my iTunes playlist—and dim the lights. My wife stays downstairs.

“To do some research,” she says.

Much later, just as I’m making the delicate nightly transition from restless leg syndrome to stage 1 sleep, I’m suddenly jarred wide awake by the sound of someone cursing loudly and repeatedly as they get into bed next to me. I vaguely recognize my wife’s voice, though it sounds different over the haze of near-sleep and a not-insignificant amount of wine.

What’s going on, I ask her.

She mutters that she’s just spent the last hour and a half exhausting the infinite wisdom of Google, circumnavigating the information superhighway, and consulting with various technical support workers scattered all around the Indian subcontinent. And the clear consensus among all sources is that there is not now, and never was, any General Jirafas.

“It just means giraffes,” she says.

…and then there were two!

Last year when I launched my lab (which, full disclosure, is really just me, plus some of my friends who were kind enough to let me plaster their names and faces on my website), I decided to call it the Psychoinformatics Lab (or PILab for short and pretentious), because, well, why not. It seemed to nicely capture what my research is about: psychology and informatics. But it wasn’t an entirely comfortable decision, because a non-trivial portion of my brain was quite convinced that everyone was going to laugh at me. And even now, after more than a year of saying I’m a “psychoinformatician” whenever anyone asks me what I do, I still feel a little bit fraudulent each time–as if I’d just said I was a member of the Estonian Cosmonaut program, or the president of the Build-a-Bear fan club*.

But then… just last week… everything suddenly changed! All in one fell swoop–in one tiny little nudge of a shove-this-on-the-internet button, things became magically better. And now colors are vibrating**, birds are chirping merry chirping songs–no, wait, those are actually cicadas–and the world is basking in a pleasant red glow of humming monitors and five-star Amazon reviews. Or something like that. I’m not so good with the metaphors.

Why so upbeat, you ask? Well, because as of this writing, there is no longer just the one lone Psychoinformatics Lab. No! Now there are not one, not three, not seven Psychoinformatics Labs, but… two! There are two Psychoinformatics Labs. The good Dr. Michael Hanke (of PyMVPA and NeuroDebian fame) has just finished putting the last coat of paint on the inside of his brand new cage Psychoinformatics Lab at the Otto-von-Guericke University Magdeburg in Magdeburg, Germany. No, really***: his startup package didn’t include any money for paint, so he had to barter his considerable programming skills for three buckets of Going to the Chapel (yes, that’s a real paint color).

The good Dr. Hanke drifts through interstellar space in search of new psychoinformatic horizons.

Anyway, in case you can’t tell, I’m quite excited about this. Not because it’s a sign that informatics approaches are making headway in psychology, or that pretty soon every psychology lab will have a high-performance computing cluster hiding in its closet (one can dream, right?). No sir. I’m excited for two much more pedestrian reasons. First, because from now on, any time anyone makes fun of me for calling myself a psychoinformatician, I’ll be able to say, with a straight face, well it’s not just me, you know–there are multiple ones of us doing this here research-type thing with the data and the psychology and the computers. And secondly, because Michael is such a smart and hardworking guy that I’m pretty sure he’s going to legitimize this whole enterprise and drag me along for the ride with him, so I won’t have to do anything else myself. Which is good, because if laziness were an Olympic sport, I’d never leave the starting block.

No, but in all seriousness, Michael is an excellent scientist and an exceptional human being, and I couldn’t be happier for him in his new job as Lord Director of All Things Psychoinformatic (Eastern Division). You might think I’m only saying this because he just launched the world’s second PILab, complete with quote from yours truly on said lab’s website front page. Well, you’d be right. But still. He’s a pretty good guy, and I’m sure we’re going to see amazing things coming out of Magdeburg.

Now if anyone wants to launch PILab #3 (maybe in Asia or South America?), just let me know, and I’ll make you the same offer I made Michael: an envelope full of $1 bills (well, you know, I’m an academic–I can’t afford Benjamins just yet) and a blog post full of ridiculous superlatives.

 

* Perhaps that’s not a good analogy, because that one may actually exist.

** But seriously, in real life, colors should not vibrate. If you ever notice colors vibrating, drive to the nearest emergency room and tell them you’re seeing colors vibrating.

*** No, not really.

what do you get when you put 1,000 psychologists together in one journal?

I’m working on a TOP SEKKRIT* project involving large-scale data mining of the psychology literature. I don’t have anything to say about the TOP SEKKRIT* project just yet, but I will say that in the process of extracting certain information I needed in order to do certain things I won’t talk about, I ended up with certain kinds of data that are useful for certain other tangential analyses. Just for fun, I threw some co-authorship data from 2,000+ Psychological Science articles into the d3.js blender, and out popped an interactive network graph of all researchers who have published at least 2 papers in Psych Science in the last 10 years**. It looks like this:

coauthorship_graph

You can click on the image to take a closer (and interactive) look.

I don’t think this is very useful for anything right now, but if nothing else, it’s fun to drag Adam Galinsky around the screen and watch half of the field come along for the ride. There are plenty of other more interesting things one could do with this, though, and it’s also quite easy to generate the same graph for other journals, so I expect to have more to say about this later on.
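In case you want to roll your own graph for another journal, the basic recipe is simple. Here’s a sketch–the input format and library choices are illustrative, not necessarily the exact pipeline I used–that builds a weighted co-authorship graph with networkx and dumps it to the node-link JSON format that d3.js force layouts consume:

```python
import json
from itertools import combinations

import networkx as nx
from networkx.readwrite import json_graph

# Hypothetical input: one list of author names per article.
articles = [
    ["Smith, J.", "Jones, A."],
    ["Jones, A.", "Lee, K.", "Smith, J."],
]

G = nx.Graph()
for authors in articles:
    # Every pair of co-authors on a paper gets an edge (or a heavier one).
    for a, b in combinations(authors, 2):
        if G.has_edge(a, b):
            G[a][b]["weight"] += 1
        else:
            G.add_edge(a, b, weight=1)

# Serialize to node-link JSON for a d3.js force-directed layout.
with open("coauthors.json", "w") as f:
    json.dump(json_graph.node_link_data(G), f)
```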

 

* It’s not really TOP SEKKRIT at all–it just sounds more exciting that way.

** Or, more accurately, researchers who have co-authored at least 2 Psych Science papers with other researchers who meet the same criterion. Otherwise we’d have even more nodes in the graph, and as you can see, it’s already pretty messy.

the truth is not optional: five bad reasons (and one mediocre one) for defending the status quo

You could be forgiven for thinking that academic psychologists have all suddenly turned into professional whistleblowers. Everywhere you look, interesting new papers are cropping up purporting to describe this or that common-yet-shady methodological practice, and telling us what we can collectively do to solve the problem and improve the quality of the published literature. In just the last year or so, Uri Simonsohn introduced new techniques for detecting fraud, and used those tools to identify at least 3 cases of high-profile, unabashed data forgery. Simmons and colleagues reported simulations demonstrating that standard exploitation of research degrees of freedom in analysis can produce extremely high rates of false positive findings. Pashler and colleagues developed a “PsychFileDrawer” repository for tracking replication attempts. Several researchers raised trenchant questions about the veracity and/or magnitude of many high-profile psychological findings such as John Bargh’s famous social priming effects. Wicherts and colleagues showed that authors of psychology articles who are less willing to share their data upon request are more likely to make basic statistical errors in their papers. And so on and so forth. The flood shows no signs of abating; just last week, the APS journal Perspectives on Psychological Science announced that it’s introducing a new “Registered Replication Report” section that will commit to publishing pre-registered high-quality replication attempts, irrespective of their outcome.

Personally, I think these are all very welcome developments for psychological science. They’re solid indications that we psychologists are going to be able to police ourselves successfully in the face of some pretty serious problems, and they bode well for the long-term health of our discipline. My sense is that the majority of other researchers–perhaps the vast majority–share this sentiment. Still, as with any zeitgeist shift, there are always naysayers. In discussing these various developments and initiatives with other people, I’ve found myself arguing, with somewhat surprising frequency, with people who for various reasons think it’s not such a good thing that Uri Simonsohn is trying to catch fraudsters, or that social priming findings are being questioned, or that the consequences of flexible analyses are being exposed. Since many of the arguments I’ve come across tend to recur, I thought I’d summarize the most common ones here–along with the rebuttals I usually offer for why, with one possible exception, the arguments for giving a pass to sloppy-but-common methodological practices are not very compelling.

“But everyone does it, so how bad can it be?”

We typically assume that long-standing conventions must exist for some good reason, so when someone raises doubts about some widespread practice, it’s quite natural to question the person raising the doubts rather than the practice itself. Could it really, truly be (we say) that there’s something deeply strange and misguided about using p values? Is it really possible that the reporting practices converged on by thousands of researchers in tens of thousands of neuroimaging articles might leave something to be desired? Could failing to correct for the many researcher degrees of freedom associated with most datasets really inflate the false positive rate so dramatically?

The answer to all these questions, of course, is yes–or at least, we should allow that it could be yes. It is, in principle, entirely possible for an entire scientific field to regularly do things in a way that isn’t very good. There are domains where appeals to convention or consensus make perfect sense, because there are few good reasons to do things a certain way except inasmuch as other people do them the same way. If everyone else in your country drives on the right side of the road, you may want to consider driving on the right side of the road too. But science is not one of those domains. In science, there is no intrinsic benefit to doing things just for the sake of convention. In fact, almost by definition, major scientific advances are ones that tend to buck convention and suggest things that other researchers may not have considered possible or likely.

In the context of common methodological practice, it’s no defense at all to say but everyone does it this way, because there are usually relatively objective standards by which we can gauge the quality of our methods, and it’s readily apparent that there are many cases where the consensus approach leaves something to be desired. For instance, you can’t really justify failing to correct for multiple comparisons when you report a single test that’s just barely significant at p < .05 on the grounds that nobody else corrects for multiple comparisons in your field. That may be a valid explanation for why your paper successfully got published (i.e., reviewers didn’t want to hold your feet to the fire for something they themselves are guilty of in their own work), but it’s not a valid defense of the actual science. If you run a t-test on randomly generated data 20 times, you will, on average, get a significant result, p < .05, once. It does no one any good to argue that because the convention in a field is to allow multiple testing–or to ignore statistical power, or to report only p values and not effect sizes, or to omit mention of conditions that didn’t ‘work’, and so on–it’s okay to ignore the issue. There’s a perfectly reasonable question as to whether it’s a smart career move to start imposing methodological rigor on your work unilaterally (see below), but there’s no question that the mere presence of consensus or convention surrounding a methodological practice does not make that practice okay from a scientific standpoint.
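If you don’t believe the arithmetic, the claim takes about ten lines of Python to check–a quick simulation sketch; exact counts will wiggle a bit from run to run:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_sims, n_tests = 1000, 20

false_positives = 0
for _ in range(n_sims):
    for _ in range(n_tests):
        sample = rng.normal(size=30)  # pure noise: no true effect anywhere
        result = stats.ttest_1samp(sample, 0.0)
        false_positives += result.pvalue < 0.05

# Average number of "significant" results per batch of 20 tests: ~1.
print(false_positives / n_sims)
```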

“But psychology would break if we could only report results that were truly predicted a priori!”

This is a defense that has some plausibility at first blush. It’s certainly true that if you force researchers to correct for multiple comparisons properly, and report the many analyses they actually conducted–and not just those that “worked”–a lot of stuff that used to get through the filter will now get caught in the net. So, by definition, it would be harder to detect unexpected effects in one’s data–even when those unexpected effects are, in some sense, ‘real’. But the important thing to keep in mind is that raising the bar for what constitutes a believable finding doesn’t actually prevent researchers from discovering unexpected new effects; all it means is that it becomes harder to report post-hoc results as pre-hoc results. It’s not at all clear why forcing researchers to put in more effort validating their own unexpected finding is a bad thing.

In fact, forcing researchers to go the extra mile in this way would have one exceedingly important benefit for the field as a whole: it would shift the onus of determining whether an unexpected result is plausible enough to warrant pursuing away from the community as a whole, and towards the individual researcher who discovered the result in the first place. As it stands right now, if I discover an unexpected result (p < .05!) that I can make up a compelling story for, there’s a reasonable chance I might be able to get that single result into a short paper in, say, Psychological Science. And reap all the benefits that attend getting a paper into a “high-impact” journal. So in practice there’s very little penalty to publishing questionable results, even if I myself am not entirely (or even mostly) convinced that those results are reliable. This state of affairs is, to put it mildly, not A Good Thing.

In contrast, if you as an editor or reviewer start insisting that I run another study that directly tests and replicates my unexpected finding before you’re willing to publish my result, I now actually have something at stake. Because it takes time and money to run new studies, I’m probably not going to bother to follow up on my unexpected finding unless I really believe it. Which is exactly as it should be: I’m the guy who discovered the effect, and I know about all the corners I have or haven’t cut in order to produce it; so if anyone should make the decision about whether to spend more taxpayer money chasing the result, it should be me. You, as the reviewer, are not in a great position to know how plausible the effect truly is, because you have no idea how many different types of analyses I attempted before I got something to ‘work’, or how many failed studies I ran that I didn’t tell you about. Given the huge asymmetry in information, it seems perfectly reasonable for reviewers to say, You think you have a really cool and unexpected effect that you found a compelling story for? Great; go and directly replicate it yourself and then we’ll talk.

“But mistakes happen, and people could get falsely accused!”

Some people don’t like the idea of a guy like Simonsohn running around and busting people’s data fabrication operations for the simple reason that they worry that the kind of approach Simonsohn used to detect fraud is just not that well-tested, and that if we’re not careful, innocent people could get swept up in the net. I think this concern stems from fundamentally good intentions, but once again, I think it’s also misguided.

For one thing, it’s important to note that, despite all the press, Simonsohn hasn’t actually done anything qualitatively different from what other whistleblowers or skeptics have done in the past. He may have suggested new techniques that improve the efficiency with which cheating can be detected, but it’s not as though he invented the ability to report or investigate other researchers for suspected misconduct. Researchers suspicious of other researchers’ findings have always used qualitatively similar arguments to raise concerns. They’ve said things like, hey, look, this is a pattern of data that just couldn’t arise by chance, or, the numbers are too similar across different conditions.

More to the point, perhaps, no one is seriously suggesting that independent observers shouldn’t be allowed to raise their concerns about possible misconduct with journal editors, professional organizations, and universities. There really isn’t any viable alternative. Naysayers who worry that innocent people might end up ensnared by false accusations presumably aren’t suggesting that we do away with all of the existing mechanisms for ensuring accountability; but since the role of people like Simonsohn is only to raise suspicion and provide evidence (and not to do the actual investigating or firing), it’s clear that there’s no way to regulate this type of behavior even if we wanted to (which I would argue we don’t). If I wanted to spend the rest of my life scanning the statistical minutiae of psychology articles for evidence of misconduct and reporting it to the appropriate authorities (and I can assure you that I most certainly don’t), there would be nothing anyone could do to stop me, nor should there be. Remember that accusing someone of misconduct is something anyone can do, but establishing that misconduct has actually occurred is a serious task that requires careful internal investigation. No one–certainly not Simonsohn–is suggesting that a routine statistical test should be all it takes to end someone’s career. In fact, Simonsohn himself has noted that he identified a 4th case of likely fraud that he dutifully reported to the appropriate authorities only to be met with complete silence. Given all the incentives universities and journals have to look the other way when accusations of fraud are made, I suspect we should be much more concerned about the false negative rate than the false positive rate when it comes to fraud.

“But it hurts the public’s perception of our field!”

Sometimes people argue that even if the field does have some serious methodological problems, we still shouldn’t discuss them publicly, because doing so is likely to instill a somewhat negative view of psychological research in the public at large. The unspoken implication being that, if the public starts to lose confidence in psychology, fewer students will enroll in psychology courses, fewer faculty positions will be created to teach students, and grant funding to psychologists will decrease. So, by airing our dirty laundry in public, we’re only hurting ourselves. I had an email exchange with a well-known researcher to exactly this effect a few years back in the aftermath of the Vul et al “voodoo correlations” paper–a paper I commented on to the effect that the problem was even worse than suggested. The argument my correspondent raised was, in effect, that we (i.e., neuroimaging researchers) are all at the mercy of agencies like NIH to keep us employed, and if it starts to look like we’re clowning around, the unemployment rate for people with PhDs in cognitive neuroscience might start to rise precipitously.

While I obviously wouldn’t want anyone to lose their job or their funding solely because of a change in public perception, I can’t say I’m very sympathetic to this kind of argument. The problem is that it places short-term preservation of the status quo above both the long-term health of the field and the public’s interest. For one thing, I think you have to be quite optimistic to believe that some of the questionable methodological practices that are relatively widespread in psychology (data snooping, selective reporting, etc.) are going to sort themselves out naturally if we just look the other way and let nature run its course. The obvious reason for skepticism in this regard is that many of the same criticisms have been around for decades, and it’s not clear that anything much has improved. Maybe the best example of this is Sedlmeier and Gigerenzer’s 1989 paper entitled “Do studies of statistical power have an effect on the power of studies?”, in which the authors convincingly showed that despite three decades of work by luminaries like Jacob Cohen advocating power analyses, statistical power had not risen appreciably in psychology studies. The presence of such unwelcome demonstrations suggests that sweeping our problems under the rug in the hopes that someone (the mice?) will unobtrusively take care of them for us is wishful thinking.
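For a sense of scale: under conventional textbook numbers–a “medium” effect of d = 0.5, 20 subjects per group, and a two-tailed alpha of .05–a two-sample t-test will detect the effect only about a third of the time. A quick sketch of the arithmetic (using statsmodels; the values are illustrative conventions, not figures taken from the papers above):

    # Power of a two-sample t-test under conventional "textbook" values:
    # a medium effect (Cohen's d = 0.5), 20 subjects per group, alpha = .05.
    # Illustrative only -- not figures from Cohen or Sedlmeier & Gigerenzer.
    from statsmodels.stats.power import TTestIndPower

    analysis = TTestIndPower()
    power = analysis.power(effect_size=0.5, nobs1=20, alpha=0.05)
    print(round(power, 2))  # ~0.34: two of three such studies miss a real effect

    # Per-group sample size needed to reach the conventional 80% power:
    n = analysis.solve_power(effect_size=0.5, power=0.8, alpha=0.05)
    print(round(n))  # ~64 per group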

In any case, even if problems did tend to solve themselves when hidden away from the prying eyes of the media and public, the bigger problem with what we might call the “saving face” defense is that it is, fundamentally, an abuse of taxpayers’ trust. As with so many other things, Richard Feynman summed up the issue eloquently in his famous Cargo Cult Science commencement address:

For example, I was a little surprised when I was talking to a friend who was going to go on the radio. He does work on cosmology and astronomy, and he wondered how he would explain what the applications of this work were. “Well,” I said, “there aren’t any.” He said, “Yes, but then we won’t get support for more research of this kind.” I think that’s kind of dishonest. If you’re representing yourself as a scientist, then you should explain to the layman what you’re doing–and if they don’t want to support you under those circumstances, then that’s their decision.

The fact of the matter is that our livelihoods as researchers depend directly on the goodwill of the public. And the taxpayers are not funding our research so that we can “discover” interesting-sounding but ultimately unreplicable effects. They’re funding our research so that we can learn more about the human mind and hopefully be able to fix it when it breaks. If a large part of the profession is routinely employing practices that are at odds with those goals, it’s not clear why taxpayers should be footing the bill. From this perspective, it might actually be a good thing for the field to revise its standards, even if (in the worst-case scenario) that causes a short-term contraction in employment.

“But unreliable effects will just fail to replicate, so what’s the big deal?”

This is a surprisingly common defense of sloppy methodology, maybe the single most common one. It’s also an enormous cop-out, since it pre-empts the need to think seriously about what you’re doing in the short term. The idea is that, since no single study is definitive, and a consensus about the reality or magnitude of most effects usually doesn’t develop until many studies have been conducted, it’s reasonable to impose a fairly low bar on initial reports and then wait and see what happens in subsequent replication efforts.

I think this is a nice ideal, but things just don’t seem to work out that way in practice. For one thing, there doesn’t seem to be much of a penalty for publishing high-profile results that later fail to replicate. The reason, I suspect, is that we’re inclined to give researchers the benefit of the doubt: surely (we say to ourselves), Jane Doe did her best, and we like Jane, so why should we question the work she produces? If we’re really so skeptical about her findings, shouldn’t we go replicate them ourselves, or wait for someone else to do it?

While this seems like an agreeable and fair-minded attitude, it isn’t actually a terribly good way to look at things. Granted, if you really did put in your best effort–dotted all your i’s and crossed all your t’s–and still ended up reporting a false result, we shouldn’t punish you for it. I don’t think anyone is seriously suggesting that researchers who inadvertently publish false findings should be ostracized or shunned. On the other hand, it’s not clear why we should continue to celebrate scientists who ‘discover’ interesting effects that later turn out not to replicate. If someone builds a career on the discovery of one or more seemingly important findings, and those findings later turn out to be wrong, the appropriate attitude is to update our beliefs about the merit of that person’s work. As it stands, we rarely seem to do this.

In any case, the bigger problem with appeals to replication is that the delay between initial publication of an exciting finding and subsequent consensus disconfirmation can be very long, and often spans entire careers. Waiting decades for history to prove an influential idea wrong is a very bad idea if the available alternative is to nip the idea in the bud by requiring stronger evidence up front.

There are many notable examples of this in the literature. A well-publicized recent one is John Bargh’s work on the motor effects of priming people with elderly stereotypes–namely, that priming people with words related to old age makes them walk away from the experiment more slowly. Bargh’s original paper was published in 1996, and according to Google Scholar, has now been cited over 2,000 times. It has undoubtedly been hugely influential in directing many psychologists’ research programs in certain directions (in many cases, in directions that are equally counterintuitive and also now seem open to question). And yet it’s taken over 15 years for a consensus to develop that the original effect is at the very least much smaller in magnitude than originally reported, and potentially so small as to be, for all intents and purposes, “not real”. I don’t know who reviewed Bargh’s paper back in 1996, but I suspect that if they ever considered the seemingly implausible size of the effect being reported, they might well have thought to themselves, well, I’m not sure I believe it, but that’s okay–time will tell. Time did tell, of course; but time is kind of lazy, so it took fifteen years for it to tell. In an alternate universe, a reviewer might have said, well, this is a striking finding, but the effect seems implausibly large; I would like you to try to directly replicate it in your lab with a much larger sample first. I recognize that this is onerous and annoying, but my primary responsibility is to ensure that only reliable findings get into the literature, and inconveniencing you seems like a small price to pay. Plus, if the effect is really what you say it is, people will be all the more likely to believe you later on.

Or take the actor-observer asymmetry, which appears in just about every introductory psychology textbook written in the last 20–30 years. It states that people are relatively more likely to attribute their own behavior to situational factors, and relatively more likely to attribute other agents’ behaviors to those agents’ dispositions. When I slip and fall, it’s because the floor was wet; when you slip and fall, it’s because you’re dumb and clumsy. This putative asymmetry was introduced and discussed at length in a book by Jones and Nisbett in 1971, and hundreds of studies have investigated it at this point. And yet a 2006 meta-analysis by Malle suggested that the cumulative evidence for the actor-observer asymmetry is actually very weak. There are some specific circumstances under which you might see something like the postulated effect, but what is quite clear is that it’s nowhere near strong enough an effect to justify being routinely invoked by psychologists and even laypeople to explain individual episodes of behavior. Unfortunately, at this point it’s almost impossible to dislodge the actor-observer asymmetry from the psyche of most researchers–a reality underscored by the fact that the Jones and Nisbett book has been cited nearly 3,000 times, whereas the 2006 meta-analysis has been cited only 96 times (a very low rate for an important and well-executed meta-analysis published in Psychological Bulletin).

The fact that it can take many years–whether 15 or 45–for a literature to build up to the point where we’re even in a position to suggest with any confidence that an initially exciting finding could be wrong means that we should be very hesitant to appeal to long-term replication as an arbiter of truth. Replication may be the gold standard in the very long term, but in the short and medium term, appealing to replication is a huge cop-out. If you can see problems with an analysis right now that call a study’s results into question, it’s an abdication of responsibility to downplay your concerns and wait for someone else to come along and spend a lot more time and money trying to replicate the study. You should point out now why you have concerns. If the authors can address them, the results will look all the better for it. And if the authors can’t address your concerns, well, then, you’ve just done science a service. If it helps, don’t think of it as a matter of saying mean things about someone else’s work, or of asserting your own ego; think of it as potentially preventing a lot of very smart people from wasting a lot of time chasing down garden paths–and also saving a lot of taxpayer money. Remember that our job as scientists is not to make other scientists’ lives easy in the hopes they’ll repay the favor when we submit our own papers; it’s to establish and apply standards that produce convergence on the truth in the shortest amount of time possible.

“But it would hurt my career to be meticulously honest about everything I do!”

Unlike the other considerations listed above, I think the concern that being honest carries a price when it comes to doing research has a good deal of merit to it. Given the aforementioned delay between initial publication and later disconfirmation of findings (which even in the best case is usually longer than the delay between obtaining a tenure-track position and coming up for tenure), researchers have many incentives to emphasize expediency and good story-telling over accuracy, and it would be disingenuous to suggest otherwise. No malevolence or outright fraud is implied here, mind you; the point is just that if you keep second-guessing and double-checking your analyses, or insist on routinely collecting more data than other researchers might think is necessary, you will very often find that results that could have made a bit of a splash given less rigor are actually not particularly interesting upon careful cross-examination. Which means that researchers who have, shall we say, less of a natural inclination to second-guess, double-check, and cross-examine their own work will, to some degree, be more likely to publish results that make a bit of a splash (it would be nice to believe that pre-publication peer review filters out sloppy work, but empirically, it just ain’t so). So this is a classic tragedy of the commons: what’s good for a given individual, career-wise, is clearly bad for the community as a whole.

I wish I had a good solution to this problem, but I don’t think there are any quick fixes. The long-term solution, as many people have observed, is to restructure the incentives governing scientific research in such a way that individual and communal benefits are directly aligned. Unfortunately, that’s easier said than done. I’ve written a lot both in papers (1, 2, 3) and on this blog (see posts linked here) about various ways we might achieve this kind of realignment, but what’s clear is that it will be a long and difficult process. For the foreseeable future, it will continue to be an understandable though highly lamentable defense to say that the cost of maintaining a career in science is that one sometimes has to play the game the same way everyone else plays the game, even if it’s clear that the rules everyone plays by are detrimental to the communal good.

 

Anyway, this may all sound a bit depressing, but I really don’t think it should be taken as such. Personally I’m actually very optimistic about the prospects for large-scale changes in the way we produce and evaluate science within the next few years. I do think we’re going to collectively figure out how to do science in a way that directly rewards people for employing research practices that are maximally beneficial to the scientific community as a whole. But I also think that for this kind of change to take place, we first need to accept that many of the defenses we routinely give for using iffy methodological practices are just not all that compelling.

the seedy underbelly

This is fiction. Science will return shortly.


Cornelius Kipling doesn’t take No for an answer. He usually takes several of them–several No’s strung together in rapid sequence, each one louder and more adamant than the last one.

“No,” I told him over dinner at the Rhubarb Club one foggy evening. “No, no, no. I won’t bankroll your efforts to build a new warp drive.”

“But the last one almost worked,” Kip said pleadingly. “I almost had it down before the hull gave way.”

I conceded that it was a clever idea; everyone before Kip had always thought of warp drives as something you put on spaceships. Kip decided to break the mold by placing one on a hydrofoil. Which, naturally, made the boat too heavy to rise above the surface of the water. In fact, it made the boat too heavy to do anything but sink.

“Admittedly, the sinking thing is a small problem,” he said, as if reading my thoughts. “But I’m working on a way to adjust for the extra weight and get it to rise clear out of the water.”

“Good,” I said. “Because lifting the boat out of the water seems like a pretty important step on the road to getting it to travel through space at light speed.”

“Actually, it’s the only remaining technical hurdle,” said Kip. “Once it’s out of the water, everything’s already taken care of. I’ve got onboard fission reactors for power, and a tentative deal to use the International Space Station for supplies. Virgin Galactic is ready to license the technology as soon as we pull off a successful trial run. And there’s an arrangement with James Cameron’s new asteroid mining company to supply us with fuel as we boldly go where… well, you know.”

“Right,” I said, punching my spoon into my crème brûlée in frustration. The crème brûlée retaliated by splattering itself all over my face and jacket.

“See, this kind of thing wouldn’t happen to you if you invested in my company,” Kip helpfully suggested as he passed me an extra napkin. “You’d have so much money other people would feed you. People with ten or fifteen years of experience wielding dessert spoons.”


After dinner we headed downtown. Kip said there was a new bar called Zygote he wanted to show me.

“Actually, it’s not a new bar per se,” he explained as we were leaving the Rhubarb. “It’s new to me. Turns out it’s been here for several years, but you have to know someone to get in. And that someone has to be willing to sponsor you. They review your biography, look up your criminal record, make sure you’re the kind of person they want at the bar, and so on.”

“Sounds like an arranged marriage.”

“You’re not too far off. When you’re first accepted as a member, you’re supposed to give Zygote a dowry of $2,000.”

“That’s a joke, right?” I asked.

“Yes. There’s no dowry. Just the fee.”

“Two thousand dollars? Really?”

“Well, more like fifty a year. But same principle.”

We walked down the mall in silence. I could feel the insoles of my shoes wrapping themselves around my feet, and I knew they were desperately warning me to get away from Kip while I still had a limited amount of sobriety and dignity left.

“How would anyone manage to keep a place like that secret?” I asked. “Especially on the mall.”

“They hire hit men,” Kip said solemnly.

I suspected he was joking, but couldn’t swear to it. I mean, if you didn’t know Kip, you would probably have thought that the idea of putting a warp drive on a hydrofoil was also a big joke.

Kip led us into one of the alleys off Pearl Street, where he quickly located an unobtrusive metal panel set into the wall just below eye level. The panel opened inwards when we pushed it. Behind the panel, we found a faint smell of old candles and a flight of stairs. At the bottom of the stairs–which turned out to run three stories down–we came to another door. This one didn’t open when we pushed it. Instead, Kip knocked on it three times. Then twice more. Then four times.

“Secret code?” I asked.

“No. Obsessive-compulsive disorder.”

The door swung open.

“Evening, Ashraf,” Kip said to the doorman as we stepped through. Ashraf was a tiny Middle Eastern man, very well dressed. Suede pants, cashmere scarf, fedora on his head. Feather in the fedora. The works. I guess when your bar is located behind a false wall three stories below grade, you don’t really need a lot of muscle to keep the peasants out; you knock them out with panache.

“Welcome to Zygote,” Ashraf said. His bland tone made it clear that, truthfully, he wasn’t at all interested in welcoming anyone anywhere. Which made him exactly the kind of person an establishment like this would want as its doorman.

Inside, the bar was mostly empty. There were twelve or fifteen patrons scattered across various booths and animal-print couches. They all took great care not to make eye contact with us as we entered.

“I have to confess,” I whispered to Kip as we made our way to the bar. “Until about three seconds ago, I didn’t really believe you that this place existed.”

“No worries,” he said. “Until about three seconds ago, it had no idea you existed either.”

He looked around.

“Actually, I’m still not sure it knows you exist,” he added apologetically.

“I feel like I’m giving everyone the flu just by standing here,” I told him.

We took a seat at the end of the bar and motioned to the bartender, who looked to be high on a designer drug chemically related to apathy. She eventually wandered over to us–but not before stopping to inspect the countertop, a stack of coasters with pictures of archaeological sites on them, a rack of brandy snifters, and the water running from the faucet.

“Two mojitos and a strawberry daiquiri,” Kip said when she finally got close enough to yell at.

“Who’s the strawberry daiquiri for?” I asked.

“Me. They’re all for me. Why, did you want a drink too?”

I did, so I ordered the special–a pink cocktail called a Flamingo. Each Flamingo came in a tall Flamingo-shaped glass that couldn’t stand up by itself, so you had to keep holding it until you finished it. Once you were done, you could lay the glass on its side on the counter and watch it leak its remaining pink guts out onto the tile. This act was, I gathered from Kip, a kind of rite of passage at Zygote.

“This is a very fancy place,” I said to no one in particular.

“You should have seen it before the gang fights,” the bartender said before walking back to the snifter rack. I had high hopes she would eventually get around to filling our order.

“Gang fights?”

“Yes,” Kip said. “Gang fights. Used to be big old gang fights in here every other week. They trashed the place several times.”

“It’s like there’s this whole seedy underbelly to Boulder that I never knew existed.”

“Oh, this is nothing. It goes much deeper than this. You haven’t seen the seedy underbelly of this place until you’ve tried to convince a bunch of old money hippies to finance your mass-produced elevator-sized vaporizer. You haven’t squinted into the sun or tasted the shadow of death on your shoulder until you’ve taken on the Bicycle Triads of North Boulder single-file in a dark alley. And you haven’t tried to scratch the dirt off your soul–unsuccessfully, mind you–until you’ve held all-night bargaining sessions with local black hat hacker groups to negotiate the purchase of mission-critical zero-day exploits.”

“Well, that may all be true,” I said. “But I don’t think you’ve done any of those things either.”

I should have known better than to question Kip’s credibility; he spent the next fifteen minutes reminding me of the many times he’d risked his life, liberty, and (nonexistent) fortune fighting to suppress the darkest forces in Northern Colorado in the service of the greater good of mankind.

After that, he launched into his standard routine of trying to get me to buy into the latest round of his inane startup ideas. He told me, in no particular order, about his plans to import, bottle and sell the finest grade Kazakh sand as a replacement for the substandard stuff currently found on American kindergarten sandlots; to run a “reverse tourism” operation that would fly in members of distant cultures to visit disabled would-be travelers in the comfort of their own living rooms (tentative slogan: if the customer can’t come to Muhammad, Muhammad must come to the customer); and to create giant grappling hooks that could pull Australia closer to the West Coast so that Kip could speculate in airline stocks and make billions of dollars once shorter flights inevitably caused Los Angeles-Sydney routes to triple in passenger volume.

I freely confess that my recollection of the finer points of the various revenue enhancement plans Kip proposed that night is not the best. I was a little bit distracted by a woman at the far end of the bar who kept gesturing towards me the whole time Kip was talking. Actually, she wasn’t so much gesturing towards me as gently massaging her neck. But she only did it when I happened to look at her. At one point, she licked her index finger and rubbed it on her neck, giving me a pointed look.

After about forty-five minutes of this, I finally worked up the courage to interrupt Kip’s explanation of how and why the federal government could solve all of America’s economic problems overnight by convincing Balinese children to invest in discarded high school football uniforms.

“Look,” I told him, pointing down to the other side of the bar. “You see? This is why I don’t go to bars any more now that I’m married. Attractive women hit on me, and I hate to disappoint them.”

I raised my left hand and deliberately stroked my wedding band in full view.

The lady at the far end didn’t take the hint. Quite the opposite; she pushed back her bar stool and came over to us.

“Christ,” I whispered.

Kip smirked quietly.

“Hi,” said the woman. “I’m Suzanne.”

“Hi,” I said. “I’m flattered. And also married.”

“I see that. I also see that you have some food in your… neckbeard. It looks like whipped cream. At least I hope that’s what it is. I was trying to let you know from down there, so you could wipe it off without embarrassing yourself any further. But apparently you’d rather embarrass yourself.”

“It’s crème brûlée,” I mumbled.

“Weak,” said Suzanne, turning around. “Very weak.”

After she’d left, I wiped my neck on my sleeve and looked at Kip. He looked back at me with a big grin on his face.

“I don’t suppose the thought crossed your mind, at any point in the last hour, to tell me I had crème brûlée in my beard.”

“You mean your neckbeard?”

“Yes,” I sighed, making a mental note to shave more often. “That.”

“It certainly crossed my mind,” Kip said. “Actually, it crossed my mind several times. But each time it crossed, it just waved hello and kept right on going.”

“You know you’re an asshole, right?”

“Whatever you say, Captain Neckbeard.”

“Alright then,” I sighed. “Let’s get out of here. It’s past my curfew anyway. Do you remember where I left my car?”

“No need,” said Kip, putting on his jacket and clapping his hand to my shoulder. “My hydrofoil’s parked in the Spruce lot around the block. The new warp drive is in. Walk with me and I’ll give you a ride. As long as you don’t mind pushing for the first fifty yards.”

the Neurosynth viewer goes modular and open source

If you’ve visited the Neurosynth website lately, you may have noticed that it looks… the same way it’s always looked. It hasn’t really changed in the last ~20 months, despite the vague promise on the front page that in the next few months, we’re going to do X, Y, Z to improve the functionality. The lack of updates is not by design; it’s because until recently I didn’t have much time to work on Neurosynth. Now that much of my time is committed to the project, things are moving ahead pretty nicely, though the changes behind the scenes aren’t reflected in any user-facing improvements yet.

The github repo is now regularly updated and even gets the occasional contribution from someone other than myself; I expect that to ramp up considerably in the coming months. You can already use the code to run your own automated meta-analyses fairly easily; e.g., with everything set up right (follow the Readme and examples in the repo), the following lines of code:
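In rough sketch form, the query looks something like this (the exact class and method names below are from memory and may differ from the current API–the Readme in the repo is authoritative):

    # Sketch only: class/method names are assumptions; see the repo's Readme.
    from neurosynth.base.dataset import Dataset
    from neurosynth.analysis import meta

    dataset = Dataset.load('dataset.pkl')  # a previously assembled dataset
    ids = dataset.get_ids_by_expression('memory &~ (wm | working | episod*)',
                                        threshold=0.001)
    ma = meta.MetaAnalysis(dataset, ids)
    ma.save_results('memory')  # write the resulting statistical maps to disk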

…will perform an automated meta-analysis of all studies in the Neurosynth database that use the term ‘memory’ at a frequency of 1 in 1,000 words or greater, but don’t use the terms wm or working, or words that start with ‘episod’ (e.g., episodic). You can perform queries that nest to arbitrary depths, so it’s a pretty powerful engine for quickly generating customized meta-analyses, subject to all of the usual caveats surrounding Neurosynth (i.e., that the underlying data are very noisy, that terms aren’t mental states, etc.).

Anyway, with the core tools coming along, I’ve started to turn back to other elements of the project, starting with the image viewer. Yesterday I pushed the first commit of a successor to the viewer that’s currently on the Neurosynth website. In the next few weeks, the new version will replace the current one, alongside a bunch of other changes to the website.

A live demo of the new viewer is available here. It’s not much to look at right now, but behind the scenes, it’s actually a huge improvement on the old viewer in a number of ways:

  • The code is completely refactored and is all nice and object-oriented now. It’s also in CoffeeScript, which is an alternative and (if you’re coming from a Python or Ruby background) much more readable syntax for JavaScript. The source code is on github and contributions are very much encouraged. Like most scientists, I’m generally loath to share my code publicly because I think it sucks most of the time. But I actually feel pretty good about this code. It’s not good code by any stretch, but I think it rises to the level of ‘mostly sensible’, which is about as much as I can hope for.
  • The viewer now handles multiple layers simultaneously, with the ability to hide and show layers, reorder them by dragging, vary the transparency, assign different color palettes, etc. These features have been staples of offline viewers pretty much since the prehistoric beginnings of fMRI time, but they aren’t available in the current Neurosynth viewer or most other online viewers I’m aware of, so this is a nice addition.
  • The architecture is modular, so it should be quite easy in the future to drop alternative views of the data into the app without having to muck about with the app logic. E.g., adding a 3D WebGL-based view to complement the current 2D slice-based HTML5 canvas approach is on the near-term agenda.
  • The resolution of the viewer is now higher–up from 4 mm to 2 mm (which is the most common native resolution used in packages like SPM and FSL). The original motivation for downsampling to 4 mm in the prior viewer was to keep filesize to a minimum and speed up the initial loading of images. But at some point I realized, hey, we’re living in the 21st century; people have fast internet connections now. So now the files are all in 2 mm resolution, which has the unpleasant effect of increasing file sizes by a factor of about 8, but also has the pleasant effect of making it so that you can actually tell what the hell you’re looking at.

Most importantly, there’s now a clean, and near-complete, separation between the HTML/CSS content and the JavaScript code. Which means that you can now effectively drop the viewer into just about any HTML page with just a few lines of code. So in theory, you can have basically the same viewer you see in the demo just by sticking something like the following into your page:
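To give the flavor, here’s a simplified sketch–apart from addSlider() and addData(), which I discuss below, the element IDs and method names are placeholders rather than the viewer’s documented API (the example/ folder in the repo has the working markup):

    <div id="viewer">
      <div id="view_axial" class="view"></div>
      <div id="threshold" class="slider"></div>
    </div>
    <script>
      // Placeholder names except addSlider()/addData(); see example/ for the
      // working version.
      var viewer = new Viewer('#viewer');            // bind the viewer to a container
      viewer.addSlider('threshold', '#threshold');   // wire up a threshold control
      viewer.addData('MNI152', 'data/MNI152.json');  // load a JSON-format image
      viewer.paint();                                // initial render
    </script>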

Well, okay, there are some other dependencies and styling stuff you’re not seeing. But all of that stuff is included in the example folder here. And of course, you can modify any of the HTML/CSS you see in the example; the whole point is that you can now easily style the viewer however you want it, without having to worry about any of the app logic.

What’s also nice about this is that you can easily pick and choose which of the viewer’s features you want to include in your page; nothing will (or at least, should) break no matter what you do. So, for example, you could decide you only want to display a single view showing only axial slices; or to allow users to manipulate the threshold of layers but not their opacity; or to show the current position of the crosshairs but not the corresponding voxel value; and so on. All you have to do is include or exclude the various addSlider() and addData() lines you see above.

Of course, it wouldn’t be a mediocre open source project if it didn’t have some important limitations I’ve been hiding from you until near the very end of this post (hoping, of course, that you wouldn’t bother to read this far down). The biggest limitation is that the viewer expects images to be in JSON format rather than a binary format like NIFTI or Analyze. This is a temporary headache until I or someone else can find the time and motivation to adapt one of the JavaScript NIFTI readers that are already out there (e.g., Satra Ghosh‘s parser for xtk), but for now, if you want to load your own images, you’re going to have to take the extra step of first converting them to JSON. Fortunately, the core Neurosynth Python package has an img_to_json() method in the imageutils module that will read in a NIFTI or Analyze volume and produce a JSON string in the expected format. I’m pretty sure it doesn’t handle orientation properly for some images, though, so don’t be surprised if your images look wonky. (And more importantly, if you fix the orientation issue, please commit your changes to the repo.)
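Conversion amounts to something along these lines (the module path and exact signature are from memory, so double-check against the package source):

    # img_to_json() lives in neurosynth's imageutils module; the module path
    # and signature shown here are assumptions -- check the package source.
    from neurosynth.base.imageutils import img_to_json

    json_str = img_to_json('zstat1.nii.gz')  # read a NIFTI volume, emit JSON
    with open('zstat1.json', 'w') as f:
        f.write(json_str)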

In any case, as long as you’re comfortable with a bit of HTML/CSS/JavaScript hacking, the example/ folder in the github repo has everything you need to drop the viewer into your own pages. If you do use this code internally, please let me know! Partly for my own edification, but mostly because when I write my annual progress reports to the NIH, it’s nice to be able to truthfully say, “hey, look, people are actually using this neat thing we built with taxpayer money.”

several half-truths, and one blatant, unrepentant lie about my recent whereabouts

Apparently time does a thing that is much like flying. Seems like just yesterday I was sitting here in this chair, sipping on martinis, and pleasantly humming old show tunes while cranking out several high-quality blog posts an hour–er, a mediocre blog post every week or two. But then! Then I got distracted! And blinked! And fell asleep in my chair! And then when I looked up again, 8 months had passed! With no blog posts!

Granted, on the Badness Scale, which runs from 1 to Imminent Apocalypse, this one clocks in at a solid 1.04. But still, eight months is a long time to be gone–about 3,000 internet years. So I figured I’d write a short post about the events of the past eight months before setting about the business of trying (and perhaps failing) to post here more regularly. Also, to keep things interesting, I’ve thrown in one fake bullet. See if you can spot the impostor.

  • I started my own lab! You can tell it’s a completely legitimate scientific operation because it has (a) a fancy new website, (b) other members besides me (some of whom I admittedly had to coerce into ‘joining’), and (c) weekly meetings. (As far as I can tell, these are all the necessary requirements for official labhood.) I decided to call my very legitimate scientific lab the Psychoinformatics Lab. Partly because I like how it sounds, and partly because it’s vaguely descriptive of the research I do. But mostly because it results in a catchy abbreviation: PILab. (It’s pronounced Pieeeeeeeeeee lab–the last 10 e’s are silent.)
  • I’ve been slowly writing and re-writing the Neurosynth codebase. Neurosynth is a thing made out of software that lets neuroimaging researchers very crudely stitch together one giant brain image out of other smaller brain images. It’s kind of like a collage, except that unlike most collages, in this case the sum is usually not more than its parts. In fact, the sum tends to look a lot like its parts. In any case, with some hard work and a very large serving of good luck, I managed to land an R01 grant from the NIH last summer, which will allow me to continue stitching images for a few more years. From my perspective, this is a very good thing, for two reasons. First, because it means I’m not unemployed right now (I’m a big fan of employment, you see); and second, because I’m finding the stitching surprisingly enjoyable. If you enjoy stitching software into brain images, please help out.
  • I published a bunch of papers in 2012, so, according to my CV at least, it was a good year for me professionally. Actually, I think it was a deceptively good year–meaning, I don’t think I did any more work than I did in previous years, but various factors (old projects coming to fruition, a bunch of papers all getting accepted at the same time, etc.) conspired to produce more publications in 2012. This kind of stuff has a tendency to balance out in fairly short order though, so I fully expect to rack up a grand total of zero publications in 2013.
  • I went to Iceland! And England! And France! And Germany! And the Netherlands! And Canada! And Austin, Texas! Plus some other places. I know many people spend a lot of their time on the road and think hopping across various oceans is no big deal, but, well, it is to me, so BACK OFF. Anyway, it’s been nice to have the opportunity to travel more. And to combine business and pleasure. I am not one of those people–I think you call them ‘sane’–who prefer to keep their work life and their personal life cleanly compartmentalized, and try to cram all their work into specific parts of the year and then save a few days or weeks here and there to do nothing but roll around on the beach or ski down frighteningly tall mountains. I find I’m happiest when I get to spend one part of the day giving a talk or meeting with some people to discuss the way the edges of the brain blur when you shake your head, and then another part of the day roaming around De Jordaan asking passers-by, in a stilted Dutch, “where can I find some more of those baby cheeses?”
  • On a more personal note (as the archives of this blog will attest, I have no shame when it comes to publicly divulging embarrassing personal details), my wife and I celebrated our fifth anniversary a few weeks ago. I think this one is called the congratulations, you haven’t killed each other yet! anniversary. Next up: the ten year anniversary, also known as the but seriously, how are you both still alive? decennial. Fortunately we’re not particularly sentimental people, so we celebrated our wooden achievement with some sushi, some sake, and only 500 of our closest friends–er, an early bedtime (no, seriously–we went to bed early; that’s not a euphemism for anything).
  • I contracted a bad case of vampirism while doing some prospecting work in the Yukon last summer. The details are a little bit sketchy, but I have a vague suspicion it happened on that one occasion when I was out gold panning in the middle of the night under a full moon and was brutally attacked by a man-sized bat that bit me several times on the neck. At least, that’s my best guess. But, whatever–now that my disease is in full bloom, it’s not so bad any more. I’ve become mostly nocturnal, and I have to snack on the blood of an unsuspecting undergraduate student once every month or two to keep from wasting away. But it seems like a small price to pay in return for eternal life, superhuman strength, and really pasty skin.
  • Overall, I’m enjoying myself quite a bit. I recently read somewhere that people are, on average, happiest in their 30s. I also recently read somewhere else that people are, on average, least happy in their 30s. I resolve this apparent contradiction by simply opting to believe the first thing, because in my estimation, I am, on average, happiest in my 30s.

Ok, enough self-indulgent rambling. Looking over this list, it wasn’t even a very eventful eight months, so I really have no excuse for dropping the ball on this blogging thing. I will now attempt to resume posting one to two posts a month about brain imaging, correlograms, and schweizel units. This might be a good cue for you to hit the UNSUBSCRIBE button.
