The beginning of the end: defending the dissertation, part 1
November 19, 2008
After many years (!?) and several false starts (and false ends), I can finally see the light at the end of the tunnel. In two weeks I’ll be defending my dissertation. Granted, it’s not the rah-rah, got-my-plane-ticket-and-a-bound-copy-of-my-thesis type of defense. My department does a “proposal defense”, which takes place 6-9 months before you intend to finish, so it’s both more and less stressful than the traditional defense. I have to say, I think I like this better than the traditional arrangement because it forces you to crystallize your thoughts and be able to discuss them constructively with others. As part of the process, we have to turn in a dissertation proposal two weeks prior, and it was – besides a huge pain in the rear – extremely helpful for clarifying my thesis.
My fellow students are always very gracious with their time, so many of them sat through a practice talk and gave feedback. Several hours and plenty of Thai food later, I had many pages of suggestions. The most important advice I received revolved around the following:
- Emphasize the problem you are solving and its context with regard to existing work
- Use demonstrative, motivating examples
- Have a clear structure to all parts of your talk so it is easy to follow
- Help people focus in on what is important in each slide
- Display equations extremely sparingly; use graphics when possible
- Don’t use multiple examples when one clear one will suffice
- Make fonts, axes, points, etc. as large as you can on graphs/plots
- Return to your outline, specific aims, or framework periodically to re-orient the audience
- Think about what is important for your audience to know, and cut out all other detail (you can keep what you cut out in the back of the presentation in case someone asks)
By next week I hope to have incorporated all of this into a new, 25-minutes-shorter (!!) version of the talk (yes, it was over an hour…) for another round of feedback.
For anyone interested in what my dissertation is about, here’s an abstract:
Knowledge of protein function is essential for understanding biological processes and mechanisms, which can be manipulated to treat disease or engineer beneficial outputs such as disease therapeutics or biofuels. The emergence of high-throughput biological tools, however, has produced a significant bottleneck between protein identification and functional annotation. Structural genomics projects are generating many novel protein structures with little associated functional knowledge, and so computational function characterization methods that do not rely on strict sequence or structure conservation are needed. In this proposal, I present a method for building 3D models of protein functional sites automatically from sequence motifs, called SeqFEATURE, which we have used to construct a large library of functional site models. In particular, I show that SeqFEATURE performs more robustly than other methods when sequence and structural similarity are low.
Another problem in function prediction stems from the fact that most methods require examples of known functions and do not generalize to new functions. A recent study used unsupervised clustering to group together structurally and chemically similar FEATURE-based protein microenvironments, which could potentially represent novel functions. To annotate these clusters, I developed a set of methods for ranking important terms found in the literature associated with the proteins comprising the cluster. In addition, I have adapted the “neighbor divergence per gene” (NDPG) method to assess functional coherence of protein clusters. Preliminary analyses indicate that functional clusters have much greater functional coherence than random clusters, and that coherence decreases with the amount of signal in the cluster. The NDPG method will be combined with hierarchical clustering to refine and select optimal sub-clusters for annotation.
This work extends existing frameworks in the context of structural genomics by creating a pipeline for rapid construction of robust functional site models that can be applied in a high-throughput setting, and by defining an approach by which novel biological functions can be discovered and characterized.
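For the curious, here is a rough flavor of what “ranking important terms found in the literature” for a cluster can look like. This is only a minimal Python sketch, not the actual method from the proposal: it assumes each cluster comes with a list of abstract texts, and it scores each term by a smoothed log-ratio of how often it appears in the cluster’s documents versus a background set. The function name `rank_terms`, the add-one smoothing, and the whitespace tokenization are all illustrative assumptions.

```python
import math
from collections import Counter


def rank_terms(cluster_docs, background_docs, top_n=5):
    """Rank terms by over-representation in cluster_docs vs. background_docs.

    Hypothetical helper for illustration only; the scoring scheme is an
    assumption, not the method described in the abstract.
    """

    def doc_freq(docs):
        # Count, for each term, the number of documents that contain it.
        counts = Counter()
        for doc in docs:
            counts.update(set(doc.lower().split()))
        return counts

    cluster_df = doc_freq(cluster_docs)
    background_df = doc_freq(background_docs)
    n_cluster = len(cluster_docs)
    n_background = len(background_docs)

    scores = {}
    for term, df in cluster_df.items():
        # Add-one smoothing keeps the ratio finite for terms
        # that never occur in the background set.
        p_cluster = (df + 1) / (n_cluster + 2)
        p_background = (background_df[term] + 1) / (n_background + 2)
        scores[term] = math.log(p_cluster / p_background)

    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]
```

Calling `rank_terms(cluster_abstracts, background_abstracts)` returns a list of `(term, score)` pairs, with terms that are common in the cluster but rare in the background at the top. A real pipeline would need proper tokenization, stop-word handling, and a more careful statistical test.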