What does this protein do? Ask FEATURE.

(Part of the “Dinner Table Science” series)

Photo by several_bees on Flickr

It’s a scenario that’s becoming more and more common in today’s technology-driven science: we discover a new gene or protein, but have nary an inkling of how it contributes to life as we know it. Knowing what proteins do can lead to improved medical treatments, better biofuels, and new insights into biology, but it does little good to discover proteins without studying them. If we’re lucky, the new protein happens to be similar to something we’ve already studied, and so we can guess with some confidence about its function, but what do we do when it’s not? Testing everything we can think of with laboratory experiments is clearly out of the question; there are far too many possibilities, and – let’s face it – far too few graduate students.

Well, you could try asking a computer program. Inspired by the fact that a protein’s amino acid sequence and three-dimensional structure often provide clues about what it does, computational scientists are developing algorithms to predict function for newly discovered proteins. FEATURE, developed by Russ Altman’s group at Stanford University, is one such algorithm. It uses a technique known as supervised machine learning, which derives distinguishing characteristics of a particular kind of object from known examples so that future instances can be classified automatically.

If it looks like a duck…

In FEATURE’s case, the objects are functional sites – specific locations in protein structures associated with behaviors like ion binding or enzymatic reactions. Even a fairly simple function like calcium binding is critical to the cell – calcium regulates the activity of many important proteins, controlling how cells respond to their environment, how neurons fire, and how muscles contract. FEATURE can learn what calcium binding sites look like by comparing examples of sites known to bind calcium to examples that are known to not bind calcium. It can then predict whether a new protein binds calcium. It is essentially the computer equivalent of a laboratory test for calcium binding, without the need for chemical reagents, physical quantities of the protein in question, or hours of human labor.
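To make the "learn from examples, then classify" idea concrete, here is a toy sketch in Python. It is not FEATURE's code: the three site properties, the numbers, and the function names (`fit`, `log_likelihood`, `score`) are all invented for illustration, and the simple Gaussian naive Bayes scoring with equal priors is an assumption chosen for brevity. The idea is only to show how known binding sites and known non-sites can be turned into a score for a site nobody has tested yet.

```python
import math

def fit(examples):
    """Summarize a set of example sites as a per-property (mean, variance)."""
    n = len(examples)
    stats = []
    for column in zip(*examples):
        mean = sum(column) / n
        # A small floor keeps the variance from being exactly zero.
        var = sum((x - mean) ** 2 for x in column) / n + 1e-3
        stats.append((mean, var))
    return stats

def log_likelihood(vector, stats):
    """Log-probability of a site under independent Gaussians, one per property."""
    total = 0.0
    for x, (mean, var) in zip(vector, stats):
        total += -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)
    return total

def score(vector, site_stats, nonsite_stats):
    """Positive: looks more like a binding site. Negative: more like a non-site."""
    return log_likelihood(vector, site_stats) - log_likelihood(vector, nonsite_stats)

# Invented toy data, one row per site:
# [nearby oxygen atoms, nearby carbon atoms, net charge]
binding_sites = [[6, 2, -2], [7, 3, -3], [5, 2, -2], [6, 1, -3]]
non_sites = [[1, 8, 0], [2, 7, 1], [0, 9, 0], [2, 6, 0]]

site_model = fit(binding_sites)
nonsite_model = fit(non_sites)

# Two hypothetical untested sites.
new_site = [6, 2, -2]
unlikely_site = [1, 8, 0]

print("new_site looks like a binding site:",
      score(new_site, site_model, nonsite_model) > 0)
print("unlikely_site looks like a binding site:",
      score(unlikely_site, site_model, nonsite_model) > 0)
```

The "training" step is nothing more than summarizing each group of examples. The prediction step asks which summary a new site resembles more. Real functional-site models use far more properties and far more examples, but the division of labor is the same.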

Got function?

Want to know what FEATURE thinks of your protein? Give it a scan at WebFEATURE. If you’d rather build a custom model for your favorite function, the source code is available, too.

Using the FEATURE algorithm, the Altman group has created models for many different protein functions, including calcium and zinc binding. In the works are projects that probe the dynamic nature of protein function – how functional sites change as proteins flex and wiggle – and efforts to discover new kinds of functional sites without prior knowledge.

Not a miracle cure, but useful

Automated classification has its own drawbacks, of course. FEATURE uses protein structure data to learn the distinguishing properties of functional sites, but only a fraction of proteins have structure data available. More generally, computational tools provide only predictions (some more accurate than others) and so do not eliminate the need for experimental verification.

Despite these problems, algorithms for predicting function are an important part of studying proteins in the post-genomic era. The growing popularity of large-scale projects such as metagenomics (sequencing all the DNA in samples of natural environments such as the oceans and our digestive systems) and structural genomics (solving the structures of all known proteins) means that the rate at which we discover new proteins is increasing much faster than the rate at which we acquire understanding about what they do. Armed with computational tools like FEATURE, we can narrow down the possibilities and generate testable hypotheses, making an initially intimidating task – figuring out what protein X does – more tractable.

Note: My Ph.D. work is based on FEATURE.

2 Responses to What does this protein do? Ask FEATURE.

  1. rbaltman says:

    Nice post. We are particularly interested in experimental collaborators who would like to test some of our predictions. We have some very high confidence ones, and even some ideas about how to get $$ to support the experiments. Operators are standing by. russ.altman@stanford.edu

  2. Daniel Jurczak says:

    Now, this sounds extremely interesting.
