Archive for the 'Genetics' Category


The importance of family (genomes)

Yesterday, a paper on the analysis and interpretation of the genomes of a family of four, spearheaded by Rick Dewey and Euan Ashley, was released in PLoS Genetics and featured in the Wall Street Journal. I was fortunate to be involved in this groundbreaking analysis, a logical next step after the clinical interpretation of Steve Quake’s genome last year in the Lancet. Collaborating on this paper got me thinking about analysis of family genomes in the age of GWAS (Genome-Wide Association Studies).

In the linkage studies of the past, researchers focused on families and the segregation patterns of alleles to identify genes significantly linked with disease. These studies worked well for rare diseases, as they could focus on a single linked region or gene at a time. But for complex, multigenic diseases, the segregation patterns are not as clear, and the GWAS community has stepped in to tackle these problems on a larger scale. However, the genetic basis of only a few diseases, such as age-related macular degeneration, has been successfully mapped by GWAS (to, say, greater than 50% of the genetic variance explained by the factors identified in the studies), and the bulk of diseases and traits have come up short. For complex diseases, the difficulty is the same as before: with so many unaccounted-for variables, we are back to a needle-in-a-haystack problem. There is great potential in combining family data with GWAS-based methods: in a method analogous to Sarah Ng and Jay Shendure’s identification of disease genes in rare diseases by exome sequencing, the ability to “subtract out” some of the noise (which may be family-specific) may yield more reliable results. Specifically, an unaffected family member may be used to down-weight the SNPs shared with an affected subject.
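To make the down-weighting idea concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not the method used in the paper: variants are represented as plain string identifiers, the function name `weight_variants` is made up for this example, and the `shared_weight` of 0.1 is an arbitrary placeholder rather than an estimated value.

```python
def weight_variants(affected, unaffected_relatives, shared_weight=0.1):
    """Assign a weight to every variant carried by the affected subject.

    affected: set of variant identifiers seen in the affected subject.
    unaffected_relatives: iterable of sets, one per unaffected family member.
    shared_weight: hypothetical down-weight for variants that an unaffected
        relative also carries (an arbitrary illustrative value).
    """
    # Pool every variant seen in at least one unaffected relative.
    carried_by_unaffected = set().union(*unaffected_relatives)
    # Variants private to the affected subject keep full weight.
    return {
        variant: (shared_weight if variant in carried_by_unaffected else 1.0)
        for variant in affected
    }


# Made-up identifiers, purely for illustration.
affected_subject = {"snp_a", "snp_b", "snp_c"}
unaffected_sibling = {"snp_b", "snp_d"}
weights = weight_variants(affected_subject, [unaffected_sibling])
```

In this toy case `snp_b` is down-weighted because the unaffected sibling also carries it, while `snp_a` and `snp_c` keep full weight. A real analysis would also have to account for penetrance, age of onset, and genotype (not just carrier status).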

Looking at the genomes of the whole family at once in a clinical assessment context (applying results from large studies to a smaller number of individuals) was crucial to this analysis. At the most basic level, simply estimating the error rates is greatly aided by the sequencing of multiple family members: knowing that the likelihood that AG and GG parents will have an AA child is vanishingly small gives us a confidence level for the SNP calls we do make. Then, when it comes to assessment of disease risk, analysis of multiple family members demonstrates the exact problem of complex diseases. While neither parent may be at risk for a disease, the exact combination of alleles passed down can confer a greater risk than the average of the parents. It is precisely here that genetic risk has the potential to trump family history in clinical analysis. At present, family history is a great predictor of clinical outcome, as it encapsulates much of the uncharacterized risk conferred by genetics. However, as our understanding of the genetic factors of disease increases, the genetic profile can incorporate something the family history cannot: the precise pattern of allele segregation. Finally, a family analysis can allow for phased genomes, which can reveal the presence of “compound heterozygotes,” or cases where the two alleles of a gene are affected by two different SNPs. While each of these may not be damaging on its own, the combination may render both copies of the gene ineffective.
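The AG/GG/AA example can be turned into a simple consistency check. The sketch below is a simplification under stated assumptions, not the paper's pipeline: genotypes are two-letter strings for biallelic autosomal SNPs, de novo mutations and missing calls are ignored, and the function names are invented for this example.

```python
from itertools import product


def possible_child_genotypes(mother, father):
    """Unordered genotypes a child can inherit, one allele from each parent."""
    return {"".join(sorted(m + f)) for m, f in product(mother, father)}


def is_mendelian_consistent(mother, father, child):
    """True if the child's genotype could arise from the parents' genotypes."""
    return "".join(sorted(child)) in possible_child_genotypes(mother, father)


def mendelian_error_rate(trios):
    """Fraction of (mother, father, child) calls that violate inheritance."""
    trios = list(trios)
    if not trios:
        return 0.0
    errors = sum(not is_mendelian_consistent(*trio) for trio in trios)
    return errors / len(trios)


# Made-up calls at four sites; only the first violates Mendelian inheritance.
calls = [
    ("AG", "GG", "AA"),
    ("AG", "GG", "AG"),
    ("AA", "AA", "AA"),
    ("AG", "AG", "GG"),
]
rate = mendelian_error_rate(calls)
```

The resulting rate is only a rough proxy for the true genotyping error rate, since many calling errors still produce genotypes that are consistent with inheritance and so go undetected by this check.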

As the availability of genome-wide methods rapidly expanded, analysis of families seemed to go out of fashion for a while. Of course, we will need sophisticated informatics methods to tease out the signal from the noise, and these will not be trivial. However, given current trends in the cost of genotyping and genome sequencing, a dataset of 100 families with a common disease is not out of reach. Then, of course, the clinical assessment of a family genome is another challenge, to which this paper brings a novel perspective, and it will be fascinating to follow the further development of these methods.


Genomes Unzipped

This morning, Genomes Unzipped launched phase 2 of their website: a dive into the analysis of personal genomics. Today, this began with the release of their personal (23andme and Counsyl) genetic data, as well as a snazzy-looking genome browser targeted to personal genetic data. While playing with the site, the same lesson dawned on me that I’ve noticed a number of times before (especially during the Personalized Medicine course): personal genomes are so much more interesting when they are personal. Tools like the genome browser (and their forthcoming analysis code) are instantly more useful, entertaining, and (most importantly) educational and illuminating when exploring one’s own genotype data.

On top of this, ensuring open access to data, along with openness of genome research projects, is essential to progress. While consent issues are, of course, extremely important, the addition of any phenotype information is crucial to the success of genetic discovery programs: one can only imagine how this would have made the already-powerful 1000 Genomes Project even more powerful. Genome-wide, trait-wide association studies, based on open communal analyses, have the potential to transform the landscape of genetics and heritability. So hats off to Daniel MacArthur, Luke Jostins, and the whole Genomes Unzipped team for getting this project moving. I look forward to seeing what comes out of the data and the experience in general.


Stanford Personalized Medicine Course

A few weeks ago, Mike Snyder gave the last lecture of Personalized Medicine and Genomics (Genetics 210) here at Stanford University, a course for which I’ve had the privilege of being a teaching assistant. In this pilot program, Stanford medical and graduate students were taught about the state of the art in personal genetics and given the option to get themselves genotyped. While we are still analyzing course survey data and it has not yet been established whether the course will be offered again (in its current form), one thing is clear: everyone involved in the course, from the students to the teachers, from the proponents to the critics, learned something about genetic testing. The San Francisco Chronicle did a great job covering the class before and after, but I thought I’d cover a few more of the details.

33 students (the TA included) learned something about their personal genetic risks. Whether it was a genetically estimated risk of prostate cancer rising from the population average of 17% to a personal 24% or an “increased-risk” designation for hypertension, these students now understand and can connect with the contribution of their genetics to their personal health. While the concept of an odds ratio could have been presented starting with statistics and ending with an anonymous number that confers a disease risk on an anonymous group of people, instructors Keyan Salari and Euan Ashley presented this concept to students through their own genetic reality, making the results of the students’ analysis real and tangible. When someone sees they have a TT genotype at a locus and the disease is more common in people who have TT at the same locus (than in others who may have an AT or AA genotype), it makes them think critically about what this means in general, for the population, and for them. Importantly, the students now understand that a genetic test is not a diagnosis. It is a scientifically informed estimate of disease risk, based on the application of published scientific studies (which students were taught to scrutinize and analyze critically, with all the reasons a study like this may fail or at least be incomplete) to a personal genome. While there exist a few conditions for which genetics plays a majority role, most results provided by a DTC genetic testing company (such as 23andme) confer moderate risks. A typical result may involve an increase from a 1/6 chance of getting prostate cancer at some point in life to a 1/4 chance. At first glance, the suggestion that this person is “high risk” for a disease may sound scary to the uninformed. However, if a person were told that the risk of prostate cancer for individuals of his race and ethnic background was 24% (while only 16% overall), this would likely not cause undue stress, but he would be better informed and might consider earlier screening options.

The same goes for the students’ response to various drugs. In a lecture on pharmacogenomics, Russ Altman brought his expertise in gene-drug interactions and made it personal. It’s easy to say “different people respond to drugs differently based on genetics,” but it is not until one sees one’s own genetics suggesting an increased sensitivity to warfarin that one can pause and say “If I didn’t know my genetic factors for warfarin dosing, I might be prescribed too much (which can cause side effects such as hemorrhaging).” Even if this person is not currently taking warfarin, it is easy to discern which of these two statements is more effective in learning about pharmacogenomics.

These students also learned something about what their DNA can tell them about their ancestry. In Carlos Bustamante’s lecture, running PCA and admixture methods, they observed where they fell on the “genetic map.” Most of the time, this was not news to the students: after all, ancestry is not a particularly anonymous trait. However, seeing the power of these methods to detect differences between populations and to separate even an individual’s genome into African-, Asian-, and European-derived sections demonstrated the extent of diversity among individuals who are otherwise 99.93% similar. Of course, some students observed results that were not as straightforward and could only be explained by dissecting the methods employed, a personal foray into scientific analysis.

Along these lines, one particularly important lesson students learned was how to take a scientific look at studies involving genetic information. For instance, students observed the shortcomings of genetic information, such as the inability to significantly predict a fairly heritable trait, height (although it should be noted here that the instructors learned a valuable lesson about science education, that not everything works out as planned; in the course’s height prediction for this particular sample, genetics predicted height better than the prediction based on parents’ heights, the opposite of the typically reported result). In addition, with Stuart Kim’s perspective on the genomics of aging, the students explored the scientific methods behind these studies through a closer look at the centenarian prediction paper, which was published while the class was in session and subsequently questioned by the scientific community.

The course ended with Mike Snyder’s vision of the future of personal genomics and what will happen when the cost of a full genome sequence falls to consumer-affordable levels. At that point, an individual may have knowledge about his or her rare variants, or even the rarest variants (“private” mutations), for which annotation and information are not as readily available as they are for well-studied, annotated SNPs. The students were encouraged to think about this largely unexplored area, which will no doubt become integrated with broader aspects of biology and science.

Before the students decided whether to get genotyped, they were presented with perspectives (from genetic counselors and ethicists, Kelly Ormond, Louanne Hudgins, Hank Greely, and Mike Grecius) and asked to think critically about the implications of knowing their personal genotype. This was no doubt an important aspect of the course, ensuring that the medical and graduate students could give informed consent before receiving information with which they may not have been familiar. Once they made their decision, however, the informing did not end there. The lectures and exercises (and options for genetic counseling) encouraged students to constantly explore their genetics, to truly understand the basis of the information and what it means to them. With students invested in the analysis of their personal genetic data, we hoped to effectively teach the personal nature of genetics. While we have not yet fully analyzed the final effects of the course and the effectiveness of a genotyping option, at the very least, we believe we have dispelled some of the fears and controversy around genetic testing for these students. And hopefully, the 60 students who took time out of their summer to take an optional elective course learned something about personal genotypes.


Personality and Genetics

These days, it’s rare to come across a published study that describes a negative result. It’s even rarer when lots of people are interested. But in keeping with the theme of missing heritability, a report in Biological Psychology was released a few days ago that found no common genetic variants significantly contributing to differences in personality. This should not be surprising to anyone who has been following the ongoing search for heritability. If height (a definitively heritable trait with an easy-to-measure quantitative output) cannot be easily explained by common variants in large studies, a search for genetic associations with a less objective trait such as personality is certainly in trouble.

And that’s exactly what happened. “No SNPs reached genome wide significance (α = 7.2 × 10⁻⁸) and the SNP with the lowest p-value for each personality scale explains less than 0.5% of the total variance.” The usability of the psychological tests (and any personality test, for that matter) for a quantitative measure of personality is a contentious topic, discussed here and here by psychologists in the study’s wake. Boiling down an individual’s personality into a series of numbers is not a perfect science (“perfect science” is sort of an oxymoron in itself, isn’t it?), and the result would no doubt change over time (I had friends who took Myers-Briggs tests yearly to look at how the results changed). In any case, even if this sort of information could be easily ascertained, the discovery power will likely be low at current GWAS sample sizes. Considering personality as a “rare disease” (in one sense, every individual has a basically distinct personality, so it’s the rarest trait around), finding the (likely rare) variants that contribute to personality will indeed be a challenge.
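For readers wondering where a threshold like 7.2 × 10⁻⁸ comes from, here is a small sketch of the arithmetic behind a Bonferroni-style correction. This is an assumption for illustration: the quoted passage does not say how the study derived its threshold, and the 0.05 family-wise error rate and the function names are choices made for this example.

```python
def bonferroni_threshold(family_wise_alpha, n_tests):
    """Per-test significance level that keeps the family-wise rate at alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return family_wise_alpha / n_tests


def implied_number_of_tests(family_wise_alpha, per_test_alpha):
    """Number of independent tests implied by a given per-test threshold."""
    if per_test_alpha <= 0:
        raise ValueError("per_test_alpha must be positive")
    return family_wise_alpha / per_test_alpha


# Assuming a conventional family-wise alpha of 0.05 (not stated in the quote).
one_million_snps = bonferroni_threshold(0.05, 1_000_000)
implied_tests = implied_number_of_tests(0.05, 7.2e-8)
```

Under that assumption, the quoted threshold corresponds to roughly 700,000 effectively independent tests. With so many tests, a SNP explaining less than 0.5% of the variance needs a very large sample to clear the bar.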

This is where larger aggregated studies might come into play. Citizen science groups like DIYgenomics and research snippets/surveys in 23andWe can theoretically open the door to greater amounts of data than any single researcher can afford. Of course, such endeavors are subject to some limitations. Most importantly, traits in these citizen studies are self-reported. This means that self-perception will alter the results (whether consciously or subconsciously). For instance, if a 23andWe research snippet asked “Do you consider yourself a good person?”, citizens’ perception of themselves will most likely steer the answer in a positive direction (although it may be an interesting experiment in genetic associations with self-perception itself, in observing the individuals who answer “No”). Personality test questions are often more sophisticated than this one, but may suffer from the same problems.

Thanks to Razib Khan for getting this discussion going. It will be interesting to see where the investigation into the genetics of personality goes next.


Response to GAO testimony on DTC Genetic Testing

This was a guest blog post I wrote for Russ Altman over at his blog:

On July 22, 2010, the Government Accountability Office (GAO) released a testimony on Direct To Consumer (DTC) Genetic Testing companies. In the testimony, the GAO sent DNA from 5 anonymous donors to 4 anonymous DTC companies. At times, the results were astonishing, such as claims made by some of the companies taking advantage of ill-informed customers to sell custom supplements “based” on genetic test results. However, the testimony also revealed a fundamental disconnect in communication between science, medicine, and the public: a disconnect that has always existed, but is now being brought to the public eye, as recent technologies have begun to bridge the gap between scientists and consumers.

To preface, it is of course outrageous that anyone interpret a DTC genetic test as a diagnostic test (at least in their current form). Analysis of a personal genome is not a medical test. For the bulk of genetic markers, having a “high risk” allele for a disease is not even close to a diagnosis of the disease. It is simply an indicator that on average, in the particular population chosen by a research study (often a small population or one selected to be enriched for a particular disease), individuals with that particular allele had a higher incidence of the disease in question than individuals without the allele (i.e. the “high risk” allele carries higher odds of the disease than the other). The companies then translate this into an overall disease risk, which adjusts the prior probability of getting the disease by this odds ratio. Depending on which studies and genes/alleles a company takes into consideration, this risk may vary considerably. In any case, the report provides a final probability of getting a disease, which may or may not match the actual outcome. Just as an individual can get lung cancer without smoking, one can get diabetes even with a below-average risk.
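The adjustment described above can be sketched in a few lines of Python. This is a simplified illustration, not any company's actual method: the function names are invented, the prior is treated as the risk in people without the allele, and multiplying odds ratios across markers assumes the markers act independently, which is a strong assumption.

```python
def adjust_risk(prior_probability, odds_ratio):
    """Update a prior disease probability by an odds ratio.

    Converts probability to odds, scales the odds, and converts back.
    """
    if not 0.0 < prior_probability < 1.0:
        raise ValueError("prior_probability must be strictly between 0 and 1")
    if odds_ratio <= 0:
        raise ValueError("odds_ratio must be positive")
    prior_odds = prior_probability / (1.0 - prior_probability)
    posterior_odds = prior_odds * odds_ratio
    return posterior_odds / (1.0 + posterior_odds)


def combine_odds_ratios(odds_ratios):
    """Multiply per-marker odds ratios (assumes independent markers)."""
    combined = 1.0
    for ratio in odds_ratios:
        combined *= ratio
    return combined


# Illustrative numbers only: a 1-in-6 baseline and an odds ratio of 5/3.
personal_risk = adjust_risk(1 / 6, 5 / 3)
# Two hypothetical markers, one raising and one lowering the odds.
two_marker_risk = adjust_risk(1 / 6, combine_odds_ratios([1.5, 0.8]))
```

With these illustrative inputs, a 1-in-6 baseline and an odds ratio of 5/3 give a personal risk of about 1 in 4. Two companies that include different markers, or use different baselines, will feed different numbers into the same arithmetic and report different risks for the same person.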

This is not to say all the calculations of the disease risk interpretations of all these companies are flawless (we haven’t verified the math and studies in all these companies), but the fact remains that there are legitimate scientific differences on how to interpret the data. While no particular method is outright “wrong,” there are better and worse ways to analyze results of genetic tests and competition among DTC companies for the highest quality interpretations should become increasingly important. Of course, it is objectively difficult to measure which interpretation is “best,” but this will change as more data become available both in predictive claims and possibilities for validation.

According to the testimony, the Department of Health and Human Services’ Secretary’s Advisory Committee on Genetics, Health, and Society notes that “[practitioners] cannot keep up with the pace of genetic tests and are not adequately prepared to use test information to treat patients appropriately.” While this may be true at present, this need not stop information from genetic tests from entering the clinic. A general practitioner may not be able to keep up with the latest advances in neurosurgery, but that’s where the specialist system thrives. In any case, just as clinicians are expected to demonstrate a basic level of competence in immunology in medical school, genetics must be treated the same way. Here at Stanford, a pilot project was launched to teach medical students about the field of genetic testing in an interactive classroom setting with state-of-the-art methods for analysis of personal genotypes.

Deceptive marketing practices, including “personalized supplements” (allegedly with celebrity endorsements) and drugs that may “repair damaged DNA” (allegedly called “epigenetics”), to say nothing of surreptitious testing and scientifically nonsensical claims, are inexcusable and irresponsible for any company, not only those in this particular market. However, the delicate matter of genetic testing and its use as a clinical guidance tool is a concept that must be explored further. The GAO uses the phrase “misleading test results”: it should be noted that while the current implementation of the reporting of test results may be misleading in certain ways, the framework of genetic testing is not in itself misleading. Proper interpretation is grounded in the same mathematical and biological context as much of today’s medicine. There is great potential for the use of genetic tests in the clinic, so long as results are carefully interpreted. While this was often limited to geneticists in the past, we hope that it can soon be accomplished by physicians and the public.
