How to write a bioinformatics research paper
September 30, 2008
A while ago, I posted my advisor’s take on the anatomy of a Ph.D. thesis. This post will be similar, except it tackles another sometimes intimidating task faced by graduate students – writing a paper. Again, this is my advisor’s take on the process, and given his experience and panache I’m inclined to agree with it; however, I myself don’t tend to follow guidelines terribly well so I’ll just say that this is a somewhat idealized process that usually becomes messier in practice. The hardest part is getting started, though, and then having a guideline is great for motivation. Though this example is geared towards life sciences/biomedicine/bioinformatics type papers, much of it should still be applicable to other fields.
Before you write: create an “elevator pitch”
Formulate a one-liner describing the main point of the paper. It should be specific, interesting, and tied to how the paper contributes to the field. (A one-liner is handy to have in your pocket for other occasions too, such as answering “What do you do / what is [bioinformatics]?” or “What is your research on?”)
Anatomy of a paper (not necessarily in the order you’ll write it)
Abstract (7-10 sentences)
The abstract presents the logic of the paper. Because this is the first and often only part of the paper most people will read, it is extremely important to write it well.
- Sentence 1: Describe the important unsolved problem.
- Sentence 2: Emphasize the challenge/unsolvedness. (For grants and certain papers, this is known as the “people dying” sentence.)
- Sentence 3: Describe the critical sub-problem of interest.
- Sentence 4: Describe the opportunity presented. (This is the “sense of hope” sentence.)
- Sentences 5-6: Briefly summarize the methods.
- Sentences 7-8: Briefly summarize the results, including a few exact numbers or findings.
- Sentence 9: Describe the specific contribution this research makes to the field.
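The outline above can be turned into a quick sanity check on a draft. The following is only a minimal sketch: the names `ABSTRACT_OUTLINE`, `split_sentences`, and `check_abstract` are made up for illustration, and the sentence splitter is deliberately naive (it will miscount around abbreviations such as “e.g.” or “et al.”).

```python
import re

# Role of each sentence slot, taken from the outline above.
ABSTRACT_OUTLINE = [
    "important unsolved problem",
    "challenge / unsolvedness",
    "critical sub-problem",
    "opportunity (sense of hope)",
    "methods",
    "methods",
    "results with exact numbers",
    "results with exact numbers",
    "specific contribution",
]


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def check_abstract(text, low=7, high=10):
    """Return (sentence_count, count_within_range, contains_a_digit)."""
    count = len(split_sentences(text))
    has_number = bool(re.search(r"\d", text))
    return count, low <= count <= high, has_number


# Hypothetical placeholder draft: one dummy sentence per slot in the outline.
draft = " ".join(f"Placeholder for the {role} sentence." for role in ABSTRACT_OUTLINE)
print(check_abstract(draft))  # (9, True, False)
```

Here the draft has the right length, but the last value is `False` because it contains no digits, a hint that the results sentences still lack exact numbers.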
Introduction
This is essentially a scholarly review of previous work, which involves scouring the literature to present the background to your problem and any relevant research done previously by you or others. You should provide a deep analysis of the overall problem and its challenges, describe other approaches to the problem besides yours, describe the problems that remain unsolved, and explain what potential there is to solve them (aka your research). Provide a very brief overview of what you will describe in the methods and telegraph your most important results.
Be generous with relevant references – it is hard to be too thorough with your review, and the folks you reference may very well end up reviewing your paper. Just don’t inflate your reference list with irrelevant work.
(Materials and) Methods
This is what you did, but not necessarily in the order you did it. You should avoid a historical recounting (this is what your lab notes are for) and instead present the final process you followed that would produce your results, in a logical fashion. For example, you might first write about your data sets, any pre-processing on that data, then the specific algorithms you used to manipulate that data, and then the evaluation and analysis of that data.
Importantly, you should never apologize for anything in your Methods section – just report it, and save the justification/explanation for the Discussion section. Your goal is essentially to explain where the figures came from. Above all, DO NOT INCLUDE RESULTS IN YOUR METHODS SECTION.
Results
Think of this section as one big caption for all of your figures. Keep it concise, refer to your figures and tables wherever relevant, and try to have its flow mirror that of the Methods section. Above all, DO NOT DISCUSS YOUR RESULTS. Just report them and save the whys and maybes for the Discussion.
Figures and Tables
These are the meat of your paper, and often the only other thing people will look at besides the abstract. It therefore behooves you to spend time making them clear, useful, and aesthetically pleasing. They should be uncluttered and easy to interpret – make sure your axes and relevant data points are labeled, and provide a legend if there are multiple types of data. Also make sure your figures can be read in grayscale, since not everyone prints in color (though the web makes this less of a problem).
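One way to test the grayscale advice before printing is to compare the approximate gray level of each series color. The sketch below uses the common Rec. 601 luma weights (0.299, 0.587, 0.114). The palette, the function names, and the `min_gap` threshold of 40 are all hypothetical choices for illustration, not a standard.

```python
from itertools import combinations


def hex_to_rgb(color):
    """Convert '#rrggbb' to a tuple of ints in the range 0..255."""
    color = color.lstrip("#")
    return tuple(int(color[i:i + 2], 16) for i in (0, 2, 4))


def gray_level(color):
    """Approximate gray value (0..255) using Rec. 601 luma weights."""
    r, g, b = hex_to_rgb(color)
    return 0.299 * r + 0.587 * g + 0.114 * b


def grayscale_clashes(palette, min_gap=40):
    """Return pairs of series whose gray levels differ by less than min_gap.

    min_gap is an arbitrary illustrative threshold, not a standard.
    """
    clashes = []
    for (name_a, col_a), (name_b, col_b) in combinations(palette.items(), 2):
        if abs(gray_level(col_a) - gray_level(col_b)) < min_gap:
            clashes.append((name_a, name_b))
    return clashes


# Hypothetical palette for three data series in one plot.
palette = {"method_x": "#ff0000", "method_y": "#00aa00", "baseline": "#000080"}
print(grayscale_clashes(palette))  # [('method_x', 'method_y')]
```

In this made-up palette the red and green series land on nearly the same gray, so they would need different line styles or markers (or different colors) to stay distinguishable on a black-and-white printout.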
Start designing your figures and tables early as they are your results. See which ones are absolutely necessary and try to go for maximum information with minimum wasted space / reader effort. And as far as looks go, even just ditching the Excel defaults is already a huge improvement.
For captions, you want to go for thought control. Bring the reader’s attention to what you want them to see and use a little spin (e.g. “X clearly outperforms Y…”). Keep it honest, though, of course.
Discussion
Start your discussion by listing the key points. Describe the positive aspects of your results first, and then admit the negatives with a little bit of spin to paint them in a more positive light – nothing dishonest here, just a discussion of some of the possible reasons for these negatives or why they might not be as negative as first thought. Make sure to justify or explain any controversial or unorthodox choices you may have made in your methods or analysis. At the end you may provide a brief positive summary of the work and a reflection on future work in this area (this can also be used as a Conclusion if applicable).
The writing process
See the graphic at the right for a rough sketch of what the writing process might look like. Basically you want to compose your one liner first, followed by the abstract and the bulk of your figures and tables. This will give you much of the content of the paper as well as the logical flow of the paper. After the methods and results it gets more iterative, and you might find yourself going back to the methods or the results to check up on certain findings or to do additional analysis, which might then change the slant you take in your discussion, etc.
When writing your first draft, try to write it as quickly as possible while addressing the goals of each section. The idea is that you will rewrite it, but for now you want to have as many of your thoughts down on paper as possible.
When revising, look at every sentence both in isolation and relative to the sentences before and after it. The former is to ensure the sentence has clarity and voice; the latter is to ensure that the flow and logic of the paper are intact. Use transitional phrases where appropriate to guide the reader through this flow.
Stylistically, you want to prefer active voice (“We designed a study” vs “a study was designed”), use “we” as opposed to “I”, and avoid colloquial phrases. Avoid repetition as well, unless it is deliberate. The occasional passive is acceptable if it breaks up repetition.
Get the point and logic of your paper down first by writing the one-liner and abstract, and then draft the rest as quickly as possible. Make sure you have most of the figures/tables you plan to have before you start writing since these are the meat of your paper. And then revise, revise, revise. Use active voice and pay attention to the overall flow of the paper. Don’t be afraid to make the writing interesting – you’ll make it that much more enjoyable for your reviewers and readers.
Obviously, you should have someone proofread your manuscript for technical details, but I highly recommend also getting a friend or colleague who is a good writer – or at least a native English speaker, if the paper is written in English – to proofread it and offer grammatical or stylistic advice.