No comment

At the risk of beating the issue to death, I offer yet another post on the question, “why don’t scientists comment on scientific articles?” Previous reflections stood within the larger context of scientific impact and article-level metrics, and I’ve also attempted some superficial analysis of commenting behavior at PLoS, BMJ, and BMC. More recently (and this is why the topic is on my mind again), a room full of bright minds at the PLoS Forum (including Cameron Neylon and Jon Eisen) scratched their heads over it and came up with pretty much the same conclusion as everyone else who’s ever thought about the problem — the costs simply outweigh the benefits.

The costs, in principle, are minimal. You might need to register for an account at the journal website and be logged on, but then all that’s needed is little more than what most of us already do multiple times a day with our email — type into a box and click “submit”. (In practice, there may be nonsensical, hidden costs that make you wonder what the folks at those journals were smoking.) So the perception that the cost-benefit equation doesn’t work speaks more to the lack of benefit than anything else.

Photo by jamesclay on flickr

A brief analysis of commenting at BMC, PLoS, and BMJ

As announced on FriendFeed and Twitter, a writing collaboration between me and the inimitable Cameron Neylon has just been published at PLoS Biology, “Article-level metrics and the evolution of scientific impact”! (Loosely based on a blog post from several months ago.)

One of the many issues Cameron and I touched on was the problem of commenting. Most people probably aren’t aware of the problem; after all, commenting is alive and well on the internet in most places you look! But click over to PLoS or BioMed Central (BMC) and the comment sections are the digital equivalent of rolling tumbleweed.

As we mention briefly in the article, comments have great potential for improving science. For one thing, they’re a form of peer review, but without the month-long wait and seemingly arbitrary review criteria. Readers, authors, and other evaluators can also get a sense of what people think about the article. The ideal is certainly tantalizing – vigorous, rigorous debates over the finer scientific points as well as the overarching conclusions, with participation from experts in the field as well as informed laypeople, always with intelligence and civility!!!1!11!!one!! But let’s not kid ourselves – the worst-case scenario is all too easy to imagine and would probably look something like the discussions over at YouTube.

And this would be positively urbane. (From PhD comics)

In memoriam: Warren DeLano



PyMOL has starred in many journal covers

On Tuesday, November 3rd, the scientific community suffered a great loss with the passing of Warren DeLano. Most people know him as the creator of PyMOL, a popular and extremely powerful molecular visualization tool, but most – including myself, until recently – may not know all of the other unique qualities that made Warren a mentor, collaborator, inspiration and friend to many. And by making PyMOL open source, Warren demonstrated his generosity and ensured that his work would continue to help future generations of scientists.

The evolution of scientific impact

Photo by cudmore on Flickr

In science, much significance is placed on peer-reviewed publication, and for good reason. Peer review, in principle, guarantees a minimum level of confidence in the validity of the research, allowing future work to build upon it. Typically, a paper (the current accepted unit of scientific knowledge) is vetted by independent colleagues who have the expertise to evaluate both the correctness of the methods and perhaps the importance of the work. If the paper passes the peer-review bar of a journal, it is published.

Measuring impact

For many years, publications in peer-reviewed journals have been the most important measurement of someone’s scientific worth. The more publications, the better. As journals proliferated, however, it became clear that not all journals were created equal. Some had higher standards of peer-review, some placed greater importance on perceived significance of the work. The “impact factor” was thus born out of a need to evaluate the quality of the journals themselves. Now it didn’t just matter how many publications you had, it also mattered where.

But, as many argue, the impact factor is flawed. Calculated as the average number of citations per “eligible” article over a specific time period, it is highly inaccurate given that the actual distribution of citations is heavily skewed (an editorial in Nature by Philip Campbell stated that only 25% of articles account for 89% of the citations). Journals can also game the system by adopting selective editorial policies to publish articles that are more likely to be cited, such as review articles. At the end of the day, the impact factor is not a very good proxy for the impact of an individual article, and to focus on it may be doing science – and scientists – a disservice.
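To make the skew argument concrete, here is a minimal sketch in Python. The citation counts are invented purely for illustration (they are not data from any journal), and the function names are my own. The sketch only shows how an average, which is the shape of the impact factor calculation, can sit far from what a typical article in the same journal receives.

```python
from statistics import mean, median

# Hypothetical citation counts for ten articles in one journal.
# These numbers are made up for illustration only.
citations = [0, 0, 1, 1, 2, 2, 3, 4, 37, 50]


def impact_factor_like(counts):
    """Average citations per article: the same kind of calculation as the impact factor."""
    return mean(counts)


def top_share(counts, fraction):
    """Share of all citations earned by the most-cited `fraction` of articles."""
    ranked = sorted(counts, reverse=True)
    k = max(1, round(len(ranked) * fraction))
    return sum(ranked[:k]) / sum(ranked)


print("mean citations per article:", impact_factor_like(citations))
print("median citations per article:", median(citations))
print("share of citations from the top 20% of articles:", top_share(citations, 0.2))
```

With these made-up numbers the mean is 10 while the median is 2, because two heavily cited articles carry most of the total. The average says little about any single paper.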

In fact, any journal-level metric will be inadequate at capturing the significance of individual papers. While few dispute the possibility that high journal impact factors may elevate some undeserving papers while low impact factors may unfairly punish perfectly valuable ones, many still feel that the impact factor – or more generally, the journal name itself – serves as a useful, general quality-control filter. Arguments for this view typically stem from two things: fear of “information overload”, and fear of risk. With so much literature out there, how will I know what is good to read? If this is how it’s been done, why should I risk my career or invest time in trying something new?

What is clear to me is this – science and society are much richer and more interconnected now than at any time in history. There are many more people contributing to science in many more ways now than ever before. Science is becoming more broad (we know about more things) and more deep (we know more about these things). At the same time, print publishing is fading, content is exploding, and technology makes it possible to present, share, and analyze information faster and more powerfully.

For these reasons, I believe (as many others do) that the traditional model of peer-reviewed journals should and will necessarily change significantly over the next decade or so.

Article-level metrics at PLoS

The Public Library of Science, or PLoS, is leading the charge on new models for scientific publishing. Now a leading Open Access publisher, PLoS oversees about 7 journals covering biology and medicine as well as PLoS ONE, on track to become the biggest single journal ever. Papers submitted to PLoS ONE cover all areas of science and medicine and are peer-reviewed only to ensure soundness of methodology and science, no matter how incremental. So while almost every other journal makes some editorial judgment on the perceived significance of papers submitted, PLoS ONE does not. Instead, PLoS ONE leaves it to the readership to determine which papers are significant through comments, downloads, and trackbacks from online discussions.

Now 2 1/2 years old, PLoS ONE boasts thousands of articles and a lot of press. But what do scientists think of it? Clearly, enough think highly of it to serve on its editorial board or as reviewers, and to publish in it. Concerns that PLoS ONE constituted “lite” peer review seem largely unfounded, or at least outdated. Indeed, there are even tales of papers getting rejected from Science or Nature because of a perceived lack of scientific merit, getting published in PLoS ONE, and then getting picked up by Science and Nature’s news sections.

Yet there is still a feeling among some that publishing in PLoS ONE carries little or no respectability. This is due in part to a misconception of how the peer review process at PLoS ONE actually works, but also in part because many people prefer an easy label for a paper’s significance. Cell, Nature, Science, PLoS Computational Biology – to most people, these journals represent sound science and important advances. PLoS ONE? It may represent sound science but it’s up to the reader to decide whether any individual paper is important.

Why is there such resistance to this idea? One reason may be tied to the time and effort it takes to establish impact: while citations have always taken some time to build up, a journal often provides a baseline proxy for the significance of a paper. A publication in Nature on your CV is an automatic feather in your cap, and easy for you and for your potential evaluators to judge. Take away the journal, and there is no baseline. For some, this is viewed as a bad thing; for others, however, it’s an opportunity to change how publications – and people – are evaluated.

Whatever the zeitgeist in particular circles, PLoS is clearly forging ahead. PLoS ONE’s publication rates continue to grow, such that people will eventually have to pay attention to papers published there even if they pooh-pooh the inclusive – but still rigorous – peer review policy. Recently, PLoS announced article-level metrics, a program to “provide a growing set of measures and indicators of impact at the article level that will include citation metrics, usage statistics, blogosphere coverage, social bookmarks, community rating and expert assessment.” (This falls under the broader umbrella of ‘post-publication peer review’.) Just how this program will work is a subject of much discussion, and certain metrics may need a lot of fine-tuning to prevent gaming of the system, but the growing consensus, at least among those discussing it online, is that it’s a step in the right direction.

Essentially, PLoS believes that the paper itself should be the driving force for significance, not the vehicle it’s in.

The trouble with comments

A major part of post-publication peer review such as PLoS’s article-level metrics is user comments. In principle, a lively and intelligent comment thread can help raise the profile of the article and engage people – whether it be other scientists or not – in a conversation about the science. This would be wonderful, but it’s also wishful thinking; as anyone who’s read blogs or visited YouTube knows, comment threads devolve quickly unless there is moderation.



For community-based knowledge curation efforts (think Wikipedia), there is also a well-known 90-9-1 rule: 90% of people merely observe, 9% make minor or only editorial contributions, and 1% are responsible for the vast majority of original content. So if your audience is only 100 people, you’ll be lucky if even one of them contributes. Indeed, experiments with wiki-based knowledge efforts in science have been rocky at best, though things seem to be getting better. The big question remains:

But will the bench scientists participate? “This business of trying to capture data from the community has been around ever since there have been biological databases,” says Ewan Birney of the European Bioinformatics Institute in Hinxton, UK. And the efforts always seem to fizzle out. Founders enthusiastically put up a lot of information on the site, but the ‘community’ — either too busy or too secretive to cooperate — never materializes. (From a news feature in Nature last September on “wikiomics”.)

Thus, for commenting on scientific articles, we have essentially two problems: encouraging scientists to comment, and ensuring that the comments have some value. An experiment on article commenting at Nature several years ago was deemed a failure due to lack of both participation and comment quality. Even now, while many see the fact that ~20% of PLoS articles have comments as a success, others see it as inadequate. Those I’ve talked to who are skeptical of the high-volume nature of PLoS ONE tend also to view their comments on papers as a highly valuable resource, one not to be given away for free in public but disclosed in private to close colleagues or leveraged for professional advancement through being a reviewer.

Perhaps the debate simply reflects different generational mindsets. After all, people are now growing up in a world where the internet is ubiquitous, sharing is second-nature, and almost all information is free. Scientific publishing is starting to change, and so it is likely that current incentive systems will change, too. Yet while the gulf will eventually disappear, it is perhaps at its widest point now, with vast differences in social norms, making any online discourse potentially fraught with unnecessary drama. As Bora Zivkovic mentions in a recent interview,

It is not easy, for a cultural reason, because a lot of scientist are not very active online and also use the very formalised language they are using in their papers. People who have been much more active online, often scientists themselves, they are more chatting, more informal. If they don’t like something they are going to say it in one sentence, not with seventeen paragraphs and eight references. So those two kinds of people, those two communities are eyeing each other with suspicion, there’s a clash of cultures. The first group sees the second group as rude. The second group views the first group as dishonest. I think it will evolve into something in the middle, but it will take years to get there.

When people point to the relative lack of comments on scientific papers, it’s important to point out the fact that online commenting has not been around in science for very long. And just as it takes time for citations to start trickling in for papers, it takes time to evaluate a paper in the context of its field. PLoS ONE is less than three years old. Bora notes, “It will take a couple of years, depends on the area of science until you can see where the paper fits in. And only then people will be commenting, because they have something to say.”

Brush off your bullshit detector

The last argument I want to touch on is that of journals serving as a filter for information. With millions of articles published every year, it can seem a daunting task keeping up with the literature in your field. What should you read? In a sense, a journal is a classifier, taking in article submissions and publishing what it thinks are good and important papers. As with any classifier, however, performance varies, and is highly dependent on the input. Still, people have come to depend on journals, especially ones with established reputations, to provide this service.

Now even journals have become too numerous for the average researcher to track (hence crude measures like the impact factor). So when PLoS ONE launched, some assumed that it would consist almost entirely of noise and useless science, if it could be considered science at all. I think it’s clear that that’s not the case; PLoS ONE papers are indeed rigorously peer-reviewed, many PLoS ONE papers have already had great impact, and people are publishing important science there. Well, they insist, even if there’s good stuff in there, how am I supposed to find what’s relevant to me out of the thousands of articles they publish every year? And how am I supposed to know whether the paper is important or not if the editors make no such judgment?

Here, I would like to point out the many tools available for filtering and ranking information on the web. At the most basic level, Google PageRank might be considered a way to predict what is significant and relevant to your search terms. But there are better ways. Subscribing to RSS feeds (e.g. through Google Reader) makes scanning lots of article titles quick and easy. Social bookmarking and collaborative filtering can suggest articles of interest based on what people like you have read. And you can directly tap into the reading lists of colleagues by following them on various social sharing services like Facebook, FriendFeed, and Twitter, or through paper management software like Mendeley. I myself use a loose network of friends and scientific colleagues on FriendFeed and Twitter to find interesting content from journals, news sites, and blog posts. The bonus is that you also interact with these people, networking at its most convenient.
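For the curious, the collaborative filtering idea can be sketched in a few lines of Python. This is a toy illustration under my own assumptions, not how any of the services named above actually works: the reader names, article IDs, and the choice of Jaccard overlap as the similarity measure are all hypothetical.

```python
# Hypothetical reading lists: reader name -> set of article IDs (all made up).
reading_lists = {
    "me": {"paper_a", "paper_b", "paper_c"},
    "colleague_1": {"paper_a", "paper_b", "paper_d"},
    "colleague_2": {"paper_c", "paper_e"},
    "stranger": {"paper_f"},
}


def jaccard(a, b):
    """Overlap between two reading lists: shared articles / all articles."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(user, lists, top_n=3):
    """Rank articles the user has not read, weighted by how similar each reader is."""
    mine = lists[user]
    scores = {}
    for other, theirs in lists.items():
        if other == user:
            continue
        similarity = jaccard(mine, theirs)
        for article in theirs - mine:
            scores[article] = scores.get(article, 0.0) + similarity
    # Highest score first; ties broken alphabetically so the order is stable.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [article for article, score in ranked if score > 0][:top_n]


print(recommend("me", reading_lists))
```

Readers whose lists overlap most with yours count for more, and a reader with nothing in common contributes nothing. That is the whole trick behind "people like you also read".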

The point is that there is a lot of information out there, you have to deal with it, and there are more and more tools to help you deal with it. It’s no longer sufficient to depend on only one filter, and an antiquated one at that. It may also be time to take PLoS’s lead and start evaluating papers on their own. Yes, it takes a little more work, but I think learning how to evaluate papers critically is a valuable skill that isn’t being taught enough. In a post about the Wyeth ghost-writing scandal, Thomas Levenson writes:

… the way human beings tell each other important things contains within it real vulnerabilities.  But any response that says don’t communicate in that way doesn’t make sense; the issue is not how to stop humans from organizing their knowledge into stories; it is how to build institutional and personal bullshit detectors that sniff out the crap amongst the good stuff.

From nitot on Flickr

Although Levenson was writing about the debate surrounding science communication and the media, I think there’s a perfect analogy to new ways of publishing. Any response that says don’t publish in that way doesn’t make sense; the issue is not how to stop people from publishing, it is how to build personal bullshit detectors – i.e. filters. People should always view what they read with a healthy dose of skepticism, and if we stop relying on journals, or impact factors, or worse to do all of our vetting for us, we’ll keep that skill nicely honed. At the same time, we are not in this alone; leveraging a network of intelligent agents – your peers – will go a long way.

So continue leading the way, PLoS. Even if not all of the experiments work, we will certainly learn from them, and keep the practice and dissemination of science evolving for the times.

Can’t attend ISMB 2009? The next best thing.

One of the biggest scientific conferences each year is Intelligent Systems for Molecular Biology (ISMB), put on by the International Society for Computational Biology (ISCB). I had the pleasure of attending the conference in Toronto last year, meeting many familiar names in person and collaborating with a number of them to microblog the sessions. That latter activity was so successful that it caught the eyes of the conference organizers, and we were able to publish a paper in PLoS Computational Biology summarizing the conference.

Even better, the ISCB is embracing microblogging from the outset this year at its ISMB meeting in Stockholm, which is starting this weekend and will run until July 2. They will be auto-generating threads for each talk in the FriendFeed room for live coverage and open commentary and are advertising that fact prominently on the website for those interested in blogging the event. Their actions are in stark contrast to those of Cold Spring Harbor, who recently updated their policies to require bloggers and twitterers to register with CSH beforehand and get advance permission from each presenter they plan on covering.

Now that blogging, microblogging, and even twittering are becoming more commonplace, it behooves conference organizers to have an official policy. Even one that is restrictive is better than no policy, which can result in an awkward backlash when people on both sides are caught unawares. Clearly there is no one-size-fits-all approach, but for conferences that do not deal with sensitive material, an open and even actively encouraging stance such as the ISCB’s is certainly liberating for those of us who are drawn to these kinds of activities.

So if you can’t attend ISMB this year for whatever reason, you (and I) are in luck. They’re freely providing the next best thing – live microblogging and a searchable archive of posts (through FriendFeed). Even if you’ll be physically attending, your experience will be arguably better if you follow the FriendFeed room. Because there’s only one of you, but there are also many others like you.

So check it out, whether you’re there or not, and if you’re there, contribute a post or two! If you’re not there, you can still participate by commenting and asking questions. That’s the beauty of it – the benefits go both ways!

What type of open notebook science are you? (Plus, more logos)

Photo by sararah on Flickr

A scientist’s notebook is like an artist’s sketchbook mixed with captain’s logs. It can be extremely personal and yet it is the definitive record for both day to day scientific research and for higher-level brainstorming. It can be haphazardly disorganized or meticulously organized. But until electronic media came around, we were stuck with pasting pieces of paper alongside handwritten notes in stacks of bound notebooks or 3-ring binders – a pain not only to store but also to search through when you’re looking for how exactly you ran that particular experiment on that particular sample on that particular equipment.

While it’s not quite the norm yet, these days it’s not uncommon for people to use software such as wikis or journaling programs to record their everyday research activities. This has obvious advantages beyond legibility and saving trees; you can search your notes, link them to data files or figures, and back up multiple copies. You can tag and categorize entries, and the electronic files are automatically timestamped. Wikis, in particular, include versioning, so that any modifications you make to an entry are also recorded and timestamped.

These features should be a boon to any researcher, but there are some important “meta” benefits that can be yours (and ours) if you choose. Making things electronic lowers barriers to access and sharing. If you use a wiki or a blog to record your notes, you can choose to keep them online (useful for accessing from anywhere there’s an internet connection), and further, to make them public. At its logical extreme, this translates to “making the entire primary record of a research project publicly available online as it is recorded” along with all raw and processed data, the current definition of open notebook science (ONS) on Wikipedia. A number of scientists and labs practice and advocate ONS, including Jean-Claude Bradley at Drexel, Cameron Neylon at the ISIS Neutron Facility, and Gus Rosania at University of Michigan. They argue that the benefits – both to themselves and to the scientific community at large – far outweigh the risks.

Complete ONS obviously isn’t for everyone, but regardless of whether the practice becomes widely adopted, we should now be able to designate certain labs or notebooks as satisfying the definition of ONS. We can even designate partial ONS – whether all or only part of the content is available, and whether the content is made available immediately or after some time delay (usually for IP or publication purposes). Jean-Claude Bradley has broken down these types of ONS into a set of claims inspired by Creative Commons licenses along with initial logos created by Andy Lang.

The Creative Commons model is great for getting across the terms of your content quickly and unambiguously, so I am a big fan of this initiative. I would love to see more research notebooks online, and to see them displaying badges or banners identifying them as a type of ONS. I got so excited that I started making my own logos, which, happily, Jean-Claude and Andy Lang seem to like:

ons-patch1 ons-patch2

Two potential problems with these logos that I can think of are:

  • whether it reads as “ons” rather than o-n-s (in which case perhaps uppercase would help),
  • the use of a beaker, which could feel exclusive to those not in the experimental or life sciences.

Incidentally, I made these images in Keynote (Apple’s version of Powerpoint), of all places. I simply couldn’t be bothered to fire up Adobe Illustrator with its bajillions of tools and palettes, and while I had to fudge a bit to get certain things to look right (my way of coloring in the beaker, for example, is hilariously crude), it was still pretty painless. Who knew?

I’ll be making more official mockups for Andy in the next day or two, so if anyone has additional feedback on these designs (or a different design entirely) I’d love to hear it!

I got a senior scientist blogging!

It’s already all over the intertubes by now, but I figured I should post it myself as well just to preserve it in my own blog archives:

My advisor, Russ Altman, and I won the “Get a senior scientist blogging” challenge sponsored by Nature!

Nature Network announced it today and there’s supposedly a press release as well. We’ve actually known for more than a week but had to keep it secret as they prepared the public announcement.

As our prize, Russ’s blog post on one of his first post-genomic moments will be in the Open Laboratory 2008 anthology and we will both get to go to SciFoo in August. Since we’re basically in Google’s back yard*, two other lucky people will get supported to attend SciFoo as well! Who doesn’t love 2-for-1 deals?

Some are curious, and indeed, so am I – what were some of the other front-runners? How many entries were received? Did the challenge actually get a significant number of scientists to start blogging? Either way, it would be nice to see some of the entrants just to add to my list of good science blogs.

Thanks to my blogging friends for the support, to Russ for taking my suggestion seriously, and to Nature Network for sponsoring the challenge! I’m definitely looking forward to SciFoo.

* At least, I hope to stay in Google’s back yard. We’ll see how the job search goes. By the way, can I put this on my resume??