Reflections on ASHG 2010

As conferences go, the American Society of Human Genetics (ASHG) annual meeting is a pretty big deal. Anyone who’s anyone in human genetics is there, and if you want to be someone you’d better be there, too. And it’s big: this year’s meeting saw more than 6,000 attendees spread throughout a gigantic convention center that spanned four square blocks in the heart of Washington, D.C. Academics, publishers, clinicians, policy wonks, and industry reps staked out their territory among an endless sea of posters, eye-popping demo booths, and cavernous session halls. The international bioinformatics meeting that I’ve gone to in the past seemed quaint by comparison.

At bioinformatics conferences, the common theme is computational methods, applied to a wide variety of topics. At a conference like ASHG, the common theme is human genetics, probed and interpreted with a variety of methods. But even the topic is breathtakingly broad. Sessions covered complex disease, non-coding RNAs, methylation, ethical/social/legal/education issues surrounding genomic research and genetic testing, mouse models, high-throughput sequencing, population and evolutionary genetics, pharmacogenetics, cilia, computational methods, and Mendelian disorders, to name just a few.

I made my first visit to ASHG this year as part of a small contingent from 23andMe**, a direct-to-consumer genomics company. Although I missed a good portion of the conference due to my schedule, some of my colleagues took notes on sessions that I missed, and ample coverage of many of the sessions could be had by following the Twitter hashtag #ashg2010. The following summaries and reflections represent a composite of tweets, other people’s notes, and my personal notes and impressions.

The meeting kicked off with several keynote talks on Tuesday evening. The two that got the most attention were Eric Lander’s retrospective on the fruits of the Human Genome Project (HGP) and a talk about the ENCODE project by John Stamatoyannopoulos. In Lander’s retrospective, he painted a picture of our knowledge of human genetics before, during, and after the HGP:

    • 20 years ago, we knew about 70 loci for Mendelian diseases. Ten years ago, we knew about 1300. Today, we know almost 3000.
    • 20 years ago, we knew one locus linked to common disease (HLA). Ten years ago, we knew about 25. Today, we know of about 1100 loci linked to about 165 common diseases and traits.
    • 20 years ago, we knew 12 loci associated with cancer. Ten years ago, we knew about 80. Today, we know 240.

We also know now that every person carries about 150 genetic variants across 1% of genes, unique to his or her genome, that change the amino acid sequence of a protein. (Clearly they don’t all cause dramatic problems, but it’s very likely that some of them have an impact on our individual health.)

Photo by jwhitesmith

Lander discussed the rise and fall of GWAS (genome-wide association studies), which lost some of their cachet when it became clear that the vast majority of associations resulting from such studies were for genetic variants that had very small effects on phenotype. Taken together, these associations thus far explain a relatively small amount of the variance in common diseases and traits, spawning the criticism of “missing heritability”. Alternative theories sprang up like dandelions in a field planted with the seeds of discontent, copy number variations (CNVs) and rare variants among them.

But Lander was skeptical that rare variants will account for the missing heritability, saying that the heritability explained by common variants is still increasing as we include the long tail of small effects, and that it will be challenging ever to find all of the heritability for common, complex diseases. Even if genome-wide association methods manage to find all loci involved, if there are multiple ways to develop the disease, those loci may never explain more than that fraction of the heritability, unless we learn how to detect and quantify epistatic interactions between loci on a large scale.

Stamatoyannopoulos followed Lander’s talk with an overview of the ENCODE project, which is tasked with building “a comprehensive parts list of the functional elements of the human genome.” He presented some interesting statistics regarding the distribution of GWAS hits throughout the genome: 47% are intronic, 7% are in coding regions, 2% are in promoter regions, 10% are within 50 kb of a gene, and the rest are farther away. The distribution suggests that much of the effect of common variants on disease is through regulatory pathways rather than changes to proteins. The UCSC Genome Browser now supports an ENCODE track that lets you explore this data.

The rest of the conference was a bit of a whirlwind given the ridiculous number of parallel sessions. Some were so popular (“Statistical Analysis of Human Sequence Variation”) that it was standing room only and security was running crowd control. Even though I couldn’t make it to most of the sessions, I was able to get a sense of which themes were popular among a subset of the attendees through keyword analysis of the 1,500 or so tweets associated with the conference:

* includes word variations such as singulars or plurals

Clearly, sequencing — particularly exome sequencing and the 1000 Genomes project — disease risk, rare variants, gene expression, and genetic mutations were hot topics, at least among the Twitterati. GWAS was a relatively less popular topic, and Jim Evans was the unexpected dark horse.
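For the curious, a keyword tally of this kind can be sketched in a few lines of Python. This is only an illustration, not the script behind the word cloud: the stopword list, the sample tweets, and the rule for folding plurals into singulars (merge a word ending in "s" only when its singular also appears) are all assumptions.

```python
import re
from collections import Counter

# Illustrative stopword list (an assumption); a real analysis would use a fuller one.
STOPWORDS = {"a", "an", "and", "at", "for", "in", "is", "of", "on", "rt", "the", "to"}

# Mentions, hashtags, and links are stripped before counting words.
NOISE = re.compile(r"@\w+|#\w+|https?://\S+")

def merge_plurals(counts):
    """Fold 'variants' into 'variant' when both forms occur in the data."""
    merged = Counter()
    for word, n in counts.items():
        singular = word[:-1]
        if word.endswith("s") and singular in counts:
            merged[singular] += n
        else:
            merged[word] += n
    return merged

def keyword_counts(tweets):
    """Count keywords across tweets, ignoring stopwords and Twitter syntax."""
    counts = Counter()
    for tweet in tweets:
        text = NOISE.sub(" ", tweet.lower())
        for word in re.findall(r"[a-z][a-z0-9']+", text):
            if word not in STOPWORDS:
                counts[word] += 1
    return merge_plurals(counts)

# Made-up sample tweets, not real data from the conference.
sample = [
    "Exome sequencing finds rare variants #ashg2010",
    "RT @someone: rare variant in exomes",
]
print(keyword_counts(sample).most_common(3))
```

Merging a plural only when its singular is also present keeps words like "GWAS" or "analysis" from being mangled, at the cost of missing plurals whose singular never shows up.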

This word cloud excludes some core Twitter usage, though: RTs (“retweets”) and mentions of specific Twitter users. If we include them in the mix, this is what we get:

More than 1/3 of the tweets (580/1548) associated with the conference were “retweets”, or repeats of original tweets. And in more than a third of those cases, people were retweeting @dgmacarthur (218 RTs). @larry_parnell, @lukejostins, @Genetics_Blog, and @delahar followed at some distance (30-50 RTs each).
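A retweet count like this can be approximated by looking for the "RT @user" convention in the tweet text. The sketch below assumes that every retweet is marked this way and credits only the first user named after "RT"; the function name and the sample tweets are made up for illustration.

```python
import re
from collections import Counter

# Matches the manual retweet convention, e.g. "RT @someone: ..."
RT_PATTERN = re.compile(r"\bRT\s+@(\w+)", re.IGNORECASE)

def retweet_stats(tweets):
    """Return (number of retweets, Counter of how often each user was retweeted)."""
    retweeted = Counter()
    n_retweets = 0
    for text in tweets:
        match = RT_PATTERN.search(text)
        if match:
            n_retweets += 1
            # Usernames are case-insensitive, so normalize before counting.
            retweeted[match.group(1).lower()] += 1
    return n_retweets, retweeted

# Made-up sample tweets, not real data from the conference.
sample = [
    "RT @dgmacarthur: Lander on missing heritability #ashg2010",
    "Standing room only in the sequencing session #ashg2010",
    "rt @DGMacArthur great summary",
]
n_retweets, retweeted = retweet_stats(sample)
print(n_retweets, len(sample), retweeted.most_common(1))
```

Dividing the retweet count by the total number of tweets gives the retweet share; with the figures above, 580/1548 comes to about 37%.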

How did that compare to actual tweetput? Well, @dgmacarthur generated a lot, but not the most tweets. That honor went to @bullymom2 (105 tweets), followed by runner-up @bachinsky (97 tweets), while @dgmacarthur got the bronze (91 tweets). As for myself, I fell solidly in the middle with 29 tweets (if you truncate the distribution at a minimum of 10 tweets per person; there is, unsurprisingly, a long tail of minimal tweeters).

Tweets are obviously a biased sample, but they’re a fun window into what at least some of the ASHG attendees were interested in. And if you went to the “tweetup”, a casual meetup of mostly Twitter-savvy conference-goers, you got to see a little further (and got to connect @ to face!). It’s also interesting to compare someone’s tweetput to how often they were retweeted, to get a crude estimate of Twitter “influence” or quality of tweets. @dgmacarthur likely had the highest RT/T score (and deservedly so); I think mine was closer to 1, which is probably respectable.
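As a rough sketch, the RT/T score is just the number of times a user was retweeted divided by the number of tweets they sent. The function below is a hypothetical helper, and the 10-tweet cutoff simply mirrors the truncation mentioned earlier; only the @dgmacarthur figures (218 RTs, 91 tweets) come from the counts in this post, and the other users are invented.

```python
def rt_per_tweet(tweets_sent, times_retweeted, min_tweets=10):
    """Retweets received per tweet sent, for users with at least min_tweets tweets."""
    scores = {}
    for user, sent in tweets_sent.items():
        if sent >= min_tweets:
            scores[user] = times_retweeted.get(user, 0) / sent
    return scores

# Only the dgmacarthur numbers are from the post; the rest are made up.
tweets_sent = {"dgmacarthur": 91, "example_user": 20, "quiet_user": 3}
times_retweeted = {"dgmacarthur": 218, "example_user": 10}

for user, score in sorted(rt_per_tweet(tweets_sent, times_retweeted).items()):
    print(f"@{user}: {score:.1f} RTs per tweet")
```

By this measure @dgmacarthur comes out at roughly 2.4 retweets per tweet (218/91), assuming the two counts cover the same set of tweets.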

The only other part of the conference I’ll comment on was the industry exhibits. To my rookie eyes, they were ridiculous. Roche basically transplanted entire lab benches complete with equipment. Others, like Genzyme and Pacific Biosciences, had their own space-age lounges set up where you could essentially be fully surrounded by their trappings and wares. Amidst all of this, the booth set up by Expression Analysis was a welcome respite. In exchange for making your mark on a communal mural, they donated $70 on your behalf to a local health organization. The finished mural will be displayed at the facility receiving the donations. Not only was it calming to flex some artistic muscle amidst the hustle and bustle of the exhibit hall, it felt good to contribute to a good cause.

For more accounts of the conference, see Luke Jostins’s series of posts (1, 2, 3, 4, and 5), Larry Parnell’s notes (1, 2, 3, and 4), and Stephen Turner’s notes from the 1000 Genomes tutorial session.

** The content and views expressed in this post are mine alone and do not represent the views of 23andMe.

6 Responses to Reflections on ASHG 2010

  1. Elisabeth (aka bullymom2) says:

    Gosh, I am blushing. I seriously doubt that my tweets were as informative as dgmacarthur’s. I’m still learning to distill my thoughts (and the talks I go to) into fewer than 140 characters.

  2. Great post, Shirley – and great meeting you at the meeting.

    This is going to make me sound churlish, but I thought the number of RTs in the channel was a serious problem, as it introduced massive redundancy. Several people at various stages went on retweeting sprees and filled up the hashtag with old tweets – not much fun for those trying to follow other sessions at the meeting.

    The only solution I can see is that people actively break the hashtag when retweeting, and I think we should try to encourage this for future meetings – just rewrite #ashg2010 as “ASHG 2010” so it still carries the required context, but doesn’t pollute the hashtag stream. I’ll certainly be doing this from now on.

    • shwu says:

      I might argue that this purpose is better served not by breaking the hashtag, but by using better tools for filtering out RTs for those who don’t want to see them. There is some value in the RTs, at least from a data perspective: which thoughts or findings did people find interesting enough to RT, and in what order? And as for the users people are RTing, well, it gives a good indication of who to follow on Twitter if you’re interested in that space.

      Of course, maybe the most efficient solution is as you say, and the data nerds can get around the broken hashtag with additional searches. ;)

  3. Hi Shirley,

    Interesting points, which I’ve been mulling over the last few days. I think I would agree with you if the majority of people had easy access to tools that allowed them to filter RTs out of the stream, but that isn’t the case – so for the majority of people following via Tweetdeck or the Twitter website, leaving the hashtag intact will inevitably fill up the channel with redundancy.

    Your analysis above illustrates the usefulness of leaving the hashtag intact for us data nerds – but ultimately (and rather reluctantly) I think we should probably place the needs of the majority of users over the needs of us nerds. :)

    Of course, one solution would be to come up with a consistent “broken hashtag” syntax – so prior to a conference we could specify “#ashg2010 for primary tweets, ASHG 2010 for RTs” – but the probability of large numbers of people following such guidelines is vanishingly low. In fact, merely encouraging people to break the hashtag will be hard enough…
