Author Archive for Konrad Karczewski


BCATS 2010

As some of you may know, for the better part of this year, I have been involved in the organization of BCATS 2010 (Biomedical Computation at Stanford), a one-day student-run conference on Stanford’s campus. Yesterday, I had the great privilege of being the chair of this interdisciplinary conference, which featured 14 talks, 10 spotlight/bullet talks, and close to 50 posters (abstract book). However, this conference was not your typical themed meeting. Rather than a conference where everyone in attendance was an expert in the specific narrow field of the conference, BCATS attendees all spoke the same language of computational and statistical analysis, but applied these methods to very different problems across biology and medicine.

The day started with Zemin Zhang from Genentech, whom we were very grateful to have deliver the first keynote address, on his work in genomics and computational biology. Dr. Zhang brought a perspective on the power of genome sequencing in understanding the complex biological basis of cancer. From there, the first group of student talks focused on computational approaches to the study of systems biology, from analysis of transcriptomics data to an integrative model of a whole cell. The second session focused on analysis of existing datasets to learn about drugs, their effects, and their interactions with other drugs. This was a particularly interesting session, as the speakers presented work that applied mathematical and statistical methods to a topic that everyone could understand without too much technical knowledge: clinical use of drugs, including things that can go right and things that can go wrong.

In the afternoon, BJ Fregly from University of Florida provided a fascinating look into personalized therapy for osteoarthritis through simulation of arthritis development, rehabilitation treatment, and forces on knee joints and muscles. In the keynote and the session that followed, the talks stressed the importance of biological models for developing our understanding of various biomechanical and biochemical processes. The final session’s talks brought together multiple sources of data for representation and interpretation of clinical data, reminding us that direct application to a clinical setting is never too far off.

All in all, the conference brought together researchers from across the fields of biocomputation in a unified setting. While the experience taught me more than I’d like to know about the logistics of organizing such an event (along with everything that can go wrong at the last minute), as soon as I sat down and started listening to the first talks, I was reminded of the quality of the science that really goes on here at Stanford. On a more personal note, I would like to thank my co-organizers, Rob Tirrell, Jessica Faruque, Amir Ghazvinian, Matt Demers, and Keyan Salari for all their help throughout, as well as our sponsors and volunteers for their support. It was a great day and I look forward to next year’s conference.


Genomes Unzipped

This morning, Genomes Unzipped launched phase 2 of their website: a dive into the analysis of personal genomics. Today, this began with the release of their personal (23andme and Counsyl) genetic data, as well as a snazzy-looking genome browser targeted to personal genetic data. While playing with the site, the same lesson dawned on me that I’ve noticed a number of times before (especially during the Personalized Medicine course): personal genomes are so much more interesting when they are personal. Tools like the genome browser (and their forthcoming analysis code) are instantly more useful, entertaining, and (most importantly) educational/illuminating when exploring one’s own genotype data.

On top of this, ensuring open access of data, along with openness of genome research projects, is essential to progress. While consent issues are, of course, extremely important, addition of any phenotype information is crucial to the success of genetic discovery programs: one can only imagine how this would have made the already-powerful 1000 Genomes Project even more powerful. Genome-wide trait-wide association studies, based on open communal analyses, have the potential to transform the landscape of genetics and heritability. So hats off to Daniel MacArthur, Luke Jostins, and the whole Genomes Unzipped team for getting this project moving. I look forward to seeing what comes out of the data and the experience in general.


Stanford Personalized Medicine Course

A few weeks ago, Mike Snyder gave the last lecture of Personalized Medicine and Genomics (Genetics 210) here at Stanford University, a course for which I’ve had the privilege of being a teaching assistant. In this pilot program, Stanford medical and graduate students were taught about the state-of-the-art in personal genetics and given the option to get themselves genotyped. While we are still analyzing course survey data and it has not yet been established if the course will be offered again (in its current form), one thing is clear: everyone involved in the course, from the students to the teachers, from the proponents to the critics, learned something about genetic testing. The San Francisco Chronicle did a great job covering the class before and after, but I thought I’d cover a bit more of the details.

33 students (the TA included) learned something about their personal genetic risks. Whether it was a genetically estimated risk of prostate cancer rising from the population average of 17% to a personal 24% or an “increased-risk” designation for hypertension, these students now understand and can connect with the contribution of their genetics to their personal health. While the concept of an odds ratio could have been presented starting with statistics and ending with an anonymous number that confers a disease risk to an anonymous group of people, instructors Keyan Salari and Euan Ashley presented this concept to students with their own genetic reality, making the results of the students’ analysis real and tangible. When someone sees they have a TT genotype at a locus and the disease is more common in people who have a TT at the same locus (than others who may have an AT or AA genotype), it makes them think critically as to what this means in general, for the population, and to them. Importantly, the students now understand that a genetic test is not a diagnosis. It is a scientifically informed estimate of disease risk, based on the application of published scientific studies (which students were taught to scrutinize and analyze critically, with all the reasons a study like this may fail or at least be incomplete) to a personal genome. While there exist a few conditions for which genetics plays a majority role, most results provided by a DTC genetic testing company (such as 23andme) confer moderate risks. A typical result may involve an increase from a 1/6 chance of getting prostate cancer at some point in life to a 1/4 chance. At first glance, the suggestion that this person is “high risk” for a disease may sound scary to the uninformed. However, if a person were told that the risk for prostate cancer for individuals of his race and ethnic background was 24% (while only 16% overall), this would likely not cause undue stress, but he would be more informed and consider earlier screening options.
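
To make the arithmetic behind a result like this concrete, here is a rough sketch that back-calculates the odds ratio implied by moving from a 1/6 to a 1/4 lifetime risk, alongside the relative and absolute changes. The function names are my own, and treating the reported risk as a single odds ratio applied to the population-average odds is a simplifying assumption, not necessarily how any particular company computes it.

```python
from fractions import Fraction


def prob_to_odds(p):
    """Convert a probability into odds (p : 1 - p)."""
    return p / (1 - p)


def odds_to_prob(odds):
    """Convert odds back into a probability."""
    return odds / (1 + odds)


def implied_odds_ratio(prior, personal):
    """Odds ratio that would move the prior risk to the personal risk."""
    return prob_to_odds(personal) / prob_to_odds(prior)


# Figures from the example above: 1/6 population risk, 1/4 personal risk.
prior = Fraction(1, 6)
personal = Fraction(1, 4)

odds_ratio = implied_odds_ratio(prior, personal)   # ratio of the two odds
relative_risk = personal / prior                   # ratio of the two probabilities
absolute_increase = personal - prior               # difference in probability

print("implied odds ratio:", odds_ratio)
print("relative risk:", relative_risk)
print("absolute increase:", absolute_increase)
```

The odds ratio and the relative risk are different numbers for the same pair of risks, which is one reason a “high risk” label needs the underlying probabilities next to it.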

The same goes for the students’ response to various drugs. In a lecture on Pharmacogenomics, Russ Altman brought his expertise in gene-drug interactions and made it personal. It’s easy to say “different people respond to drugs differently based on genetics,” but it is not until one sees their own genetics suggesting an increased sensitivity to warfarin that one can pause and say “If I didn’t know my genetic factors for warfarin dosing, I might be prescribed too much (which can cause side effects such as hemorrhaging).” Even if this person is not currently taking warfarin, it is easy to discern which of these two statements is more effective in learning about pharmacogenomics.

These students also learned something about what their DNA can tell them about their ancestry. In Carlos Bustamante’s lecture, running PCA and admixture methods, they observed where they fell on the “genetic map.” Most of the time, this was not news to the students: after all, ancestry is not a particularly anonymous trait. However, seeing the power of these methods to detect differences between populations and separate even an individual’s genome into African, Asian, and European derived sections demonstrated the extent of diversity among individuals who are otherwise 99.93% similar. Of course, some students observed results that were not as straightforward and could only be explained by dissecting the methods employed, a personal foray into scientific analysis.
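
For readers who have not seen PCA applied to genotypes, the sketch below shows the general idea on simulated data. It is not the course’s exercise or its data: the two populations, their allele frequencies, and the function names are all made up for illustration, and real analyses typically add steps such as SNP filtering and normalization that are skipped here.

```python
import numpy as np


def simulate_genotypes(rng, allele_freqs, n_individuals):
    """Draw 0/1/2 alternate-allele counts per SNP for each individual."""
    return rng.binomial(2, allele_freqs, size=(n_individuals, allele_freqs.size))


def top_principal_components(genotypes, n_components=2):
    """Project individuals onto the leading principal components via SVD."""
    centered = genotypes - genotypes.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_components] * s[:n_components]


rng = np.random.default_rng(0)
n_snps = 1000

# Two hypothetical populations whose allele frequencies have drifted apart.
freqs_a = rng.uniform(0.1, 0.9, n_snps)
freqs_b = np.clip(freqs_a + rng.normal(0.0, 0.2, n_snps), 0.05, 0.95)

genotypes = np.vstack([
    simulate_genotypes(rng, freqs_a, 50),
    simulate_genotypes(rng, freqs_b, 50),
])

pcs = top_principal_components(genotypes)
pc1_a, pc1_b = pcs[:50, 0], pcs[50:, 0]

print("mean PC1, population A:", pc1_a.mean())
print("mean PC1, population B:", pc1_b.mean())
```

Plotting the two columns of `pcs` against each other gives the kind of “genetic map” on which each individual lands as a point.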

Along these lines, one particularly important lesson students learned was a scientific look at studies involving genetic information. For instance, students observed the shortcomings of genetic information, such as the inability to significantly predict a fairly heritable trait, height (although it should be noted here that the instructors learned a valuable lesson about science education, that not everything works out as planned; in the course of height prediction of this particular sample, genetics were better able to predict height than the prediction based on parents’ heights, the opposite of the typically reported result). In addition, with Stuart Kim’s perspective on the genomics of aging, the students explored the scientific methods behind these studies, through a closer look at the centenarian prediction paper, which was published while the class was in session and subsequently questioned by the scientific community.

The course ended with Mike Snyder’s vision of the future of personal genomics and what will happen when the cost of a full genome sequence falls to consumer-affordable levels. At this point, an individual may have knowledge about his or her rare variants or the rarest variants, “private” mutations, where annotation and information are not as readily available as for well-studied, annotated SNPs. The students were encouraged to think about this largely unexplored area which will no doubt become integrated with broader aspects of biology and science.

Before the students decided whether to get genotyped or not, they were presented with perspectives (from genetic counselors and ethicists, Kelly Ormond, Louanne Hudgins, Hank Greely, and Mike Grecius) and asked to think critically about the implications of knowing their personal genotype. This was no doubt an important aspect of the course, ensuring informed consent from the medical and graduate students, who were choosing to receive information with which they may not have been familiar. Once they made their decision, however, the informing did not end there. The lectures and exercises (and options for genetic counseling) encouraged students to constantly explore their genetics, to truly understand the basis of the information and what it means to them. With students invested in the analysis of their personal genetic data, we hoped to effectively teach the personal nature of genetics. While we have not yet fully analyzed the final effects of the course and the effectiveness of a genotyping option, at the very least, we believe we have dispelled some of the fears and controversy around genetic testing for these students. And hopefully, the 60 students who took time out of their summer to take an optional elective course learned something about personal genotypes.


Personality and Genetics

These days, it’s rare to come across a published study that describes a negative result. It’s even rarer when lots of people are interested. But keeping in the theme of missing heritability, a report in Biological Psychology was released a few days ago that found no common genetic variants significantly contributing to differences in personality. This should not be surprising for anyone who has been following the ongoing search for heritability. If height (a definitively heritable trait with an easy-to-measure quantitative output) cannot be easily explained by common variants in large studies, a search for genetic associations with a less objective trait such as personality is certainly in trouble.

And that’s exactly what happened. “No SNPs reached genome wide significance (α = 7.2 × 10^-8) and the SNP with the lowest p-value for each personality scale explains less than 0.5% of the total variance.” The usability of the psychological tests (and any personality test, for that matter) for a quantitative measure of personality is a contentious topic, discussed here and here by psychologists in the study’s wake. Boiling down an individual’s personality into a series of numbers is not a perfect science (“perfect science” is sort of an oxymoron in itself, isn’t it?) and one that would no doubt change over time (I had friends that took Myers-Briggs tests yearly to look at how the results changed). In any case, even if this sort of information could be easily ascertained, the discovery power will likely be low at current GWAS power levels. Considering personality as a “rare disease” (in one sense, every individual has a basically distinct personality, so it’s the rarest trait around), finding the (likely rare) variants that contribute to personality will indeed become a challenge.
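
For a sense of where a threshold that strict comes from, here is a small sketch. It assumes the cutoff is a Bonferroni-style correction of a 0.05 family-wise error rate, which is a common convention but is my assumption here and not something the quoted sentence states; the function names are mine as well.

```python
def bonferroni_threshold(family_alpha, n_tests):
    """Per-test significance cutoff when correcting for n_tests comparisons."""
    return family_alpha / n_tests


def implied_number_of_tests(family_alpha, per_test_alpha):
    """Number of tests a Bonferroni correction would need to yield this cutoff."""
    return family_alpha / per_test_alpha


def is_genome_wide_significant(p_value, per_test_alpha):
    """True if a single SNP's p-value clears the corrected cutoff."""
    return p_value < per_test_alpha


reported_alpha = 7.2e-8   # threshold quoted from the study
family_alpha = 0.05       # assumed family-wise error rate

n_implied = implied_number_of_tests(family_alpha, reported_alpha)
print(f"implied number of independent tests: {n_implied:,.0f}")

# A p-value that would look impressive for a single test still falls short.
print(is_genome_wide_significant(1e-5, reported_alpha))
```

Under that assumption the threshold corresponds to several hundred thousand effectively independent tests, which is why a SNP needs a very small p-value before anyone gets excited.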

It’s here where larger aggregated studies might come into play. Citizen science groups like DIYgenomics and research snippets/surveys in 23andWe can theoretically open the door to greater amounts of data than any single researcher can afford. Of course, such endeavors are subject to some limitations. Most importantly, traits in these citizen studies are self-reported. This means that self-perception will alter the results (whether consciously or subconsciously). For instance, if a 23andWe research snippet asked “Do you consider yourself a good person?”, citizens’ perception of themselves will most likely steer the answer in a positive direction (although it may be an interesting experiment in genetic associations with self-perception itself, in observing the individuals who answer “No”). Personality test questions are often more advanced than this one, but may suffer the same problems.

Thanks to Razib Khan for getting this discussion going. It will be interesting to see where the investigation into the genetics of personality goes next.


Response to GAO testimony on DTC Genetic Testing

This was a guest blog post I wrote for Russ Altman over at his blog:

On July 22, 2010, the Government Accountability Office (GAO) released a testimony on Direct To Consumer (DTC) Genetic Testing companies. In the testimony, the GAO sent DNA from 5 anonymous donors to 4 anonymous DTC companies. At times, the results were astonishing, such as claims made by some of the companies taking advantage of ill-informed customers to sell custom supplements “based” on genetic test results. However, the testimony also revealed a fundamental disconnect in communication between science, medicine, and the public: a disconnect that has always existed, but is now being brought to the public eye, as recent technologies have begun to bridge the gap between scientists and consumers.

To preface, it is of course outrageous that anyone interpret a DTC genetic test as a diagnostic test (at least in their current form). Analysis of a personal genome is not a medical test. For the bulk of genetic markers, having a “high risk” allele for a disease is not even close to a diagnosis of the disease. It is simply an indicator that on average, in the particular population chosen by a research study (which are often small populations or populations selected to be enriched for a particular disease), individuals with that particular allele had a higher incidence of the disease in question than individuals without the allele (i.e. the “high risk” allele carries higher odds of the disease than the other). The companies then translate this into an overall disease risk, which adjusts the prior probability of getting the disease by this odds ratio. Depending on which studies and genes/alleles a company takes into consideration, this risk may vary considerably. In any case, the report provides a final probability of getting a disease, which may or may not actually be the same as the actual outcome. Just as an individual can get lung cancer without smoking, one can get diabetes even with a below average risk.
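
As a rough illustration of why two companies can report different risks for the same person, the sketch below adjusts a prior probability by a set of odds ratios. The prior, the odds ratios, and the two “companies” are hypothetical numbers chosen for the example, and multiplying odds ratios together assumes the markers act independently, which is a simplification and not a description of any company’s actual method.

```python
def prob_to_odds(p):
    """Convert a probability into odds."""
    return p / (1.0 - p)


def odds_to_prob(odds):
    """Convert odds back into a probability."""
    return odds / (1.0 + odds)


def adjusted_risk(prior, odds_ratios):
    """Scale the prior odds by each marker's odds ratio (markers assumed independent)."""
    odds = prob_to_odds(prior)
    for ratio in odds_ratios:
        odds *= ratio
    return odds_to_prob(odds)


# Hypothetical values for illustration only.
prior = 0.20                       # population-average lifetime risk
company_a_markers = [1.3, 1.1]     # two risk-raising markers
company_b_markers = [1.3, 1.1, 0.7]  # same two, plus one protective marker

risk_a = adjusted_risk(prior, company_a_markers)
risk_b = adjusted_risk(prior, company_b_markers)

print(f"company A estimate: {risk_a:.1%}")
print(f"company B estimate: {risk_b:.1%}")
```

Same genotype, same prior, different marker lists: one report reads as elevated risk and the other as roughly average, without either calculation being arithmetically wrong.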

This is not to say all the calculations of the disease risk interpretations of all these companies are flawless (we haven’t verified the math and studies in all these companies), but the fact remains that there are legitimate scientific differences on how to interpret the data. While no particular method is outright “wrong,” there are better and worse ways to analyze results of genetic tests and competition among DTC companies for the highest quality interpretations should become increasingly important. Of course, it is objectively difficult to measure which interpretation is “best,” but this will change as more data become available both in predictive claims and possibilities for validation.

According to the testimony, the Department of Health and Human Services’ Secretary’s Advisory Committee on Genetics, Health, and Society notes that “[practitioners] cannot keep up with the pace of genetic tests and are not adequately prepared to use test information to treat patients appropriately.” While this may be true at present, this need not stop information from genetic tests from entering the clinic. A general practitioner may not be able to keep up with the latest advances in neurosurgery, but that’s where the specialist system thrives. In any case, just as clinicians are expected to demonstrate a basic level of competence in immunology in medical school, genetics must be treated the same way. Here at Stanford, a pilot project was launched to teach medical students about the field of genetic testing in an interactive classroom setting with state-of-the-art methods for analysis of personal genotypes.

Deceptive marketing, including “personalized supplements” (allegedly with celebrity endorsements) and drugs that may “repair damaged DNA” (allegedly called “epigenetics”), to say nothing of surreptitious testing and scientifically nonsensical claims, are inexcusable and irresponsible practices for any company, not limited to this particular market. However, the delicate matter of genetic testing and its use as a clinical guidance tool is a concept that must be explored further. The GAO uses the phrase “misleading test results”: it should be noted that while the current implementation of the reporting of test results may be in certain ways misleading, the framework of genetic testing is not in itself misleading. Proper interpretation is based in the same mathematical and biological context as much of today’s medicine. There is great potential for the use of genetic tests in the clinic, so long as results are carefully interpreted. While this was often limited to geneticists in the past, we hope that this can be soon accomplished by physicians and the public.


I’m here

I figured it was about that time that I start a blog and a general information page so unsuspecting tourists on the internet (whether they were looking for me or not) could find me. Well, as promised, here I am. You can find out more about me, my research, or find collaborators/old co-workers and things I think are cool.

I’ll post some interesting perspectives and news on this blog once in a while, but I can’t promise any regularity. For now, enjoy the ride.
