Eric Turkheimer discusses a new paper examining ancestry-related genetic differences among Mexicans. The paper calculates the proportion of each individual’s DNA that comes from different ancestries — Indigenous American (IAM), European and others. (I won’t go into the details of the technique: essentially they “paint” different stretches of the genome as coming from a given ancestry.) Then the paper looks at siblings and asks, for example, “is the sibling with more IAM ancestry taller?”

This approach has two advantages. First, siblings grow up in the same family so there’s no confound of genetic ancestry with family background here; second, DNA is randomized between siblings so potentially there are no environmental confounds at all. (Or are there? See below!)
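To make the first advantage concrete, here is a minimal sketch in Python on purely synthetic data. It is not the paper's method. The function name, the effect size of 0.5, the noise levels and the way family background is made to correlate with ancestry are all assumptions chosen for illustration. It shows only that a family-level confound biases a naive comparison across families but cancels when siblings are compared with each other. It says nothing about effects that are socially mediated within a family.

```python
import random

def simulate_sibling_design(n_families=5000, true_effect=0.5, seed=1):
    """Return (naive_slope, within_sibling_slope) on synthetic data.

    All numbers here are made-up assumptions, not estimates from any study.
    """
    rng = random.Random(seed)
    xs, ys, dxs, dys = [], [], [], []
    for _ in range(n_families):
        family_ancestry = rng.uniform(0.2, 0.8)  # family's average ancestry share
        # Assumed confound: family background is correlated with ancestry.
        family_background = -3.0 * family_ancestry + rng.gauss(0, 0.1)
        sibs = []
        for _ in range(2):
            # Random segregation: siblings scatter around the family average.
            ancestry = family_ancestry + rng.gauss(0, 0.05)
            outcome = true_effect * ancestry + family_background + rng.gauss(0, 0.1)
            sibs.append((ancestry, outcome))
        (x1, y1), (x2, y2) = sibs
        xs.append(x1)
        ys.append(y1)
        dxs.append(x1 - x2)  # sibling difference in ancestry
        dys.append(y1 - y2)  # sibling difference in outcome
    # Naive slope: regress outcome on ancestry across unrelated individuals.
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    naive = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    # Within-sibling slope: regress differences on differences (no intercept),
    # so anything shared by both siblings drops out.
    within = sum(dx * dy for dx, dy in zip(dxs, dys)) / sum(dx * dx for dx in dxs)
    return naive, within

naive_slope, within_slope = simulate_sibling_design()
print(f"naive: {naive_slope:.2f}, within siblings: {within_slope:.2f}")
```

In this toy setup the naive slope comes out with the wrong sign, because family background swamps the assumed effect, while the sibling-difference slope lands close to the value that was built in.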

The paper finds no genetic differences related to educational attainment between people of different ancestries (but some for height and type II diabetes). But, as Professor Turkheimer puts it: “race-Twitter remains excited because they know there is more where that came from”. I think this means that he expects more studies, using the same methods, to find that ancestry differences do matter for education. To be honest, I also find that quite likely.

Confounds or mediators?

He argues that these studies are confounded:

… the most famous confound of genetic effects, Christopher Jencks’ “red hair” effect, would stroll right through this “natural experiment”. For the uninitiated, the red hair effect refers to a hypothetical world in which redheads are discriminated against, and wind up with poor educational attainment. In that world, genes for red hair look like causal EA genes. Well, genes for hair color segregate within families, so in red-hair world ginger kids would fare worse than their siblings, and the genes would look direct and causal.

The red hair example is important, and not necessarily hypothetical. For instance, there is probably discrimination on skin colour in the US. I don’t mean discrimination between black and white people: even among black people, those with darker skin have worse outcomes. Maybe skin tone is confounded with other aspects of a person, but a fairly plausible hypothesis is that darker-skinned people are discriminated against. If so, then genes for dark skin will indeed show up as causing worse outcomes. Obviously, saying that the resulting differences “are genetic” would be misleading: the problem is the discrimination!

Still, it is not a good idea to call effects like these “confounds”. To see why not, here’s another effect with a similar causal structure:

The “degree-granting” effect. Suppose a certain genetic variant causes the production of boffinite within the brain. Boffinite increases your IQ, and so siblings with the variant are much more likely to get a degree. But there’s a confound! Society makes people pass exams to get university degrees. This discriminates in favour of people with higher IQ. Without this social environment, the variant would cause no differences in educational attainment.

This ridiculous example goes to show that all genetic effects on interesting social outcomes are socially mediated. If we therefore rule them out as confounded, then we needn’t do any research — there are no unconfounded genetic effects!

The difference between the two examples is political, not scientific. We call the red hair effect “environmental” because we judge that it’s wrong to discriminate against red-haired people. We would think of the hypothetical boffinite variant as a “genetic” effect, because we think it’s right to make degrees depend on exams. If anything, in response to this result, we’d want to intervene in the biology, maybe by handing out boffinite supplements.

Suppose we change the example again: in a study run in the Jim Crow South, those with the high-boffinite variant are far more likely to vote. Now it’s an environmental effect! Under Jim Crow, literacy tests were imposed on voters (and administered so as to discriminate against black people), and we can suppose that a higher IQ makes them easier to pass. It’s an environmental effect, because the appropriate response is to remove the tests: degrees should depend on test performance, the ability to vote should not. We want to intervene in the social part of the causal chain.

Real world cases are likely to be harder. Suppose we find that teachers treat some kids differently, e.g. expecting more of them academically, and that mediates some genetic differences. Should we intervene? First we need to know whether the teacher is responding optimally to what they know about the child. For that, we have to evaluate a counterfactual: what will happen if they change their response to the child’s phenotype? A teacher who teaches a high-boffinite child differently may just be doing his job well. (That is, there may be gene-environment interaction: pushing a high-boffinite child academically gives different outcomes to pushing a low-boffinite child.) One who teaches a red-haired child differently is not.

In all these examples, the causal structure is the same: genes change a person’s biology, which evokes a response from the social environment. As scientists, we should be precise and call both red hair and boffinite examples of causal mediation. As policy-makers, we want to know “what are the causal mechanisms; how and where could/should we intervene?” Ignoring the genetic differences and just focusing on the environmental response, without understanding why and how it is evoked, risks misleading us.

Ancestry or ethnicity?

I have a slightly different problem with studies of “ancestry”, which again starts from the question: what are these analyses for? The paper says:

Human populations differ in phenotype means for complex traits like height and in disease prevalence for complex diseases like diabetes, but we know very little about the relative contribution of genetic and environmental factors, and their interactions, to these differences.

OK, but why do we want to learn about the contribution of genes and environment to group differences? I can think of three questions we might want to answer.

1. We want to find individual-level treatments that e.g. improve educational outcomes or lower the risk of diabetes. Genetic and environmental differences could each give us clues to possible treatments.

2. We want to make individual-level predictions about diseases or other outcomes. Genes and environments could help predict an individual’s outcome; in addition, if we know that different groups have different genes/environments, we can make sharper predictions for these groups (and perhaps understand why, and when, they will work).

3. We are interested in social-level interventions that affect group differences.

Questions 1 and 2 are the bread and butter of medical genetics. Question 3 is not, and geneticists are not always good at thinking about it. But it is obvious that there is a political context to research on group differences:

For about fifty years advanced societies, especially the US, have been deeply concerned with ethnic differences in outcomes. Liberal societies are supposed to have equality of opportunity; if you see large ethnic differences in outcomes, that suggests that opportunities aren’t equal. And of course there were very obvious unequal opportunities in the US, like segregated transport and schools. The societies made deep and serious attempts to address these differences, for example by integrating schools, by banning racial discrimination in employment, or even by enacting positive discrimination. But the differences often persisted. The working premise of these societies — at least, of the elites who made policy — was that this was evidence of more subtle environmental disadvantages. That pushed them towards more extreme and less popular interventions. And we know about the backlash.

Question 3 takes aim at the root of this dynamic because it raises the possibility that the working premise is wrong, and the interventions won’t work. This explains why research on the genetics of group differences is so controversial: it threatens a basic progressive world view, on which quite a lot of policy and social arrangements have been erected.

To give you a sense of how this plays out, when I discussed this research with one academic — a talented and principled researcher, not a careerist or political bigot — they said “if there are genetic differences, shouldn’t we just… you know… conceal them?” I was shocked, but I don’t think that is an unusual response. (Or even a crazy one! I can see the ethical reasons you might want to do this! I just don’t think they are reasons that a scientist can respect, whilst remaining a scientist.) Another example: the keynote from a very senior scientist which called research on group differences the “third rail”.

Now question 3 is about ethnic groups, not ancestries. What matters to policy is why actually existing white and black people (or mestizos and blancos, or Roma in Europe, et cetera) are different, not whether segments of DNA from different ancestries are different.

The reason to study ancestry is that directly looking at ethnic groups seems hard. For instance, famously, if you looked at a mixed sample of black and white people in the US and used genetic variants to predict their education levels, you’d find that any genetic variants which black people had more of would predict lower educational attainment. That really is confounded: all it is telling you is that black and white people differ genetically, and that black people have less education, two facts that we knew already. (This is the chopsticks gene fallacy.)

Ancestry seems to offer a way round this, because even different siblings in the same family can have different amounts of ancestry from different groups. So you can get unconfounded, or at least less confounded, estimates! OK, but estimates of what? I think there is a danger that we are substituting an easy-to-answer question for a hard question, and kidding ourselves that they are the same.

Another less-good reason to talk about ancestry is that geneticists would like to discuss these topics without mentioning the word “race”, which has an aura of being unscientific and, well, racist. But this doesn’t really work as a communication strategy, because laypeople are swiftly going to understand that people from different ethnic groups have very different ancestries on average. I am also not convinced that ancestry is actually more scientific than ethnicity. As a social scientist, I’d insist that ethnicity, even though it’s a social construct, is real and important (just like money); on the other hand, ancestry is itself a loose statistical construct, a way of guesstimating where somebody’s great-grandparents came from, with a great deal of model uncertainty hidden behind the numbers. No?

If you want to use ancestry to talk about ethnicity, you can. You can think of a black person as having X% African ancestry, a white person having less, and multiply the difference by your estimated effect size from the within-siblings analysis of ancestry. But there are risks in that!

To give one example, the distribution of DNA from each ancestry may not be the same across ethnic groups. The white-ancestry DNA segments of black people, for instance, may not be the same as the white-ancestry DNA segments of white people.

Here’s an extreme example of how that might pan out. Suppose there are two ethnic groups, and a single allele which affects an outcome, with “High” and “Low” variants. The ethnic groups start out with the same proportions of the allele. But there’s discrimination in the marriage market: there is an endogamous “elite”, consisting of people belonging to the privileged ethnicity and having the High allele. So “elites” only marry other elites, and non-elites only marry non-elites. Also, there’s a “one drop” rule: anyone from an ethnically mixed marriage counts as from the unprivileged group.

After one generation, the unprivileged group will have more of the Low variants than the privileged group, because some of their parents will be non-elite privileged-but-Low people. But if you look within the unprivileged group, the Low variant will be associated with privileged ancestry!

I simulated this with 1000 parents from each group, of whom 250 had the High allele. Among the children, here’s the table of alleles by ethnicity:

             Allele (%)
Group        H       L
1            44.2    55.8
2            16.4    83.6

People from the unprivileged group 2 are more likely to have the Low allele. But among group 2, here’s the table of ancestry by allele:

GROUP 2:

             Allele (%)
Ancestry     H       L
1            0.0     100.0
2            23.8    76.2

I assumed that ancestry is measured directly “around” the relevant allele, so if someone inherited the Low allele from one parent, they also inherited that parent’s ancestry. Within group 2, everyone with ancestry from group 1 then has the Low allele; this would hold even across siblings.

Here, the question about ancestry will give the opposite answer to the question about ethnicity! Among group 2, group 1 ancestry is significantly associated with the Low allele — but group 2 has more of the Low allele than group 1. If you look at both groups together, you get a null result:

BOTH GROUPS:

             Allele (%)
Ancestry     H       L
1            24.8    75.2
2            23.8    76.2

There is no association between ancestry and allele in the whole population — by assumption, both groups started with the same proportion of the allele! But again, this null result is misleading with respect to the actual ethnic groups.
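The toy model above can be sketched in a few lines of Python. This is not the simulation code linked at the end of the post, only an illustrative reconstruction under simplifying assumptions that are not all stated in the text: each person carries a single copy of the allele, each couple has exactly one child, and pairing is random within each marriage pool. The names simulate_children and pct_high are hypothetical. The exact percentages depend on the random seed and will not match the tables above, but the qualitative pattern should.

```python
import random

def simulate_children(n_per_group=1000, n_high=250, seed=0):
    """One generation of the elite-endogamy toy model (haploid, one child per couple)."""
    rng = random.Random(seed)
    # Each parent is (group, allele); both groups start with the same share of High.
    parents = [(group, "H" if i < n_high else "L")
               for group in (1, 2) for i in range(n_per_group)]
    elites = [p for p in parents if p == (1, "H")]   # privileged group AND High allele
    others = [p for p in parents if p != (1, "H")]   # everyone else
    children = []
    for pool in (elites, others):
        rng.shuffle(pool)
        # Pair off neighbours in the shuffled pool: random mating within the pool.
        for a, b in zip(pool[0::2], pool[1::2]):
            # The child takes the allele, and the ancestry around it, from one parent.
            donor = rng.choice((a, b))
            # "One drop" rule: group 1 only if both parents are from group 1.
            group = 1 if a[0] == 1 and b[0] == 1 else 2
            children.append({"group": group, "ancestry": donor[0], "allele": donor[1]})
    return children

def pct_high(children, **where):
    """Percentage with the High allele among children matching the given fields."""
    rows = [c for c in children if all(c[k] == v for k, v in where.items())]
    return 100 * sum(c["allele"] == "H" for c in rows) / len(rows)

children = simulate_children()
for label, where in [
    ("group 1", {"group": 1}),
    ("group 2", {"group": 2}),
    ("group 2, ancestry 1", {"group": 2, "ancestry": 1}),
    ("group 2, ancestry 2", {"group": 2, "ancestry": 2}),
    ("all, ancestry 1", {"ancestry": 1}),
    ("all, ancestry 2", {"ancestry": 2}),
]:
    print(f"{label}: {pct_high(children, **where):.1f}% High")
```

Under these assumptions group 1 ends up with a higher share of the High allele than group 2, yet within group 2 every allele with group 1 ancestry is Low, since the only group 1 parents available to mixed couples are the non-elite, Low ones.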

This is a hyper-crude example to make a point, but like the red hair effect, it is not crazy to think of similar mechanisms in the real world. For instance, in Latin America, there have long been relatively high rates of interethnic mating, combined with skin colour as a status marker.

(Results can also be misleading the other way. If only High-allele members of group 1 mate with group 2, then group 2 will have more High alleles, but within group 2, those High alleles will come more from group 1. African-Americans carry significant European ancestry: we don’t know a ton about how white ancestors of black people differed from the broader population with respect to, say, education or socio-economic status, but one possibility is that male slaveowners were especially likely to have children with slaves.)

I don’t claim the paper authors aren’t aware of these issues. They say:

…direct ancestry effect estimates from admixed families may differ from the genetic contribution to phenotype differences between more homogeneous samples. Systematic differences could be induced by assortative mating with respect to both phenotype and ancestry and if effects of alleles depend on genetic and environmental background (GxE and GxG interactions).

So I want to understand, what’s the ultimate goal of the estimation here?

Ancestry or causal variants?

In some sense, looking at the “causal effects of ancestry” is a slightly weird project. Effectively, one paints different parts of the genome as (e.g.) African-American or European by ancestry, and then tries to allocate outcomes to those segments. Is “black DNA” to blame? Or is it “white DNA”? Of course, DNA knows no white or black, and comes only in the letters A, T, C and G. The effects that you estimate are going to be weighted sums of actual effects of causal variants (plus interactions). They don’t seem likely to directly help with questions 1 and 2 above — finding individual-level treatments or making individual-level predictions. It’s like a study of “the effect of the Catholic Church”: er, could you be more specific?

More broadly: years of talking against genetic essentialism, and now we have a tool that goes “your DNA is X per cent African”! It’s the same as those fun popular know-your-ancestry DNA tests. But using the percentages as an independent variable in a scientific study… eeh. Geneticists have tried very hard to talk about “ancestry” rather than “race”, but in practice it’s easy to make the slippage. So laypeople are likely to think “oh, I’m really 15% European”. And researchers who find no differences between ancestries will be tempted to present this as shining proof of no differences between races. (And some may do the corresponding thing if they find differences, depending on their ideology.)

But it’s not that! It’s just a substitute question that we found easier to answer.

Maybe I’m stupid, but it seems to me that there is a simpler and more direct research avenue:

1. Estimate the causal effects of many individual variants among a given ethnic group.

2. Run a counterfactual where the ethnic group’s distribution of those variants matches a different ethnic group.
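As a rough illustration of the second step, here is a minimal Python sketch under strong assumptions: a purely additive model, two copies of each variant per person, per-variant effects that are truly causal and carry over unchanged between groups, and no interactions. The function name, the variant labels and every number are hypothetical, made up for the example.

```python
def counterfactual_mean_shift(effects, freqs_a, freqs_b):
    """Expected change in group A's mean outcome if its allele frequencies
    were set to group B's, under a purely additive model.

    effects: per-allele causal effect for each variant (assumed known)
    freqs_a, freqs_b: allele frequency of each variant in groups A and B
    """
    # Mean allele count per person is 2 * frequency, so the mean shifts by
    # effect * 2 * (change in frequency), summed over variants.
    return sum(2 * beta * (freqs_b[v] - freqs_a[v]) for v, beta in effects.items())

# Hypothetical inputs, for illustration only.
effects = {"variant_1": 0.10, "variant_2": -0.05}
freqs_a = {"variant_1": 0.30, "variant_2": 0.50}
freqs_b = {"variant_1": 0.40, "variant_2": 0.20}

shift = counterfactual_mean_shift(effects, freqs_a, freqs_b)
print(f"counterfactual shift in group A's mean: {shift:.3f}")
```

The same sum can be broken down variant by variant, which is what makes it possible to ask which specific variants drive a difference and what mediates their effects.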

I like this approach for three reasons. First, a lot of effort has been expended in doing part 1 already! In particular, within-family designs are quite good at estimating effects of individual variants, and geneticists are dedicating resources to creating larger family and sibling samples for this purpose. Why not leverage all that effort?

Second, this approach actually can help with questions 1 and 2 above. If you find that type II diabetes is more prevalent in Indigenous Americans “because of their genetic ancestry”, you don’t know how to treat that. If you find it is more prevalent “because they have more of these specific genetic variants”, this opens the door to further research into causal mechanisms and potential treatments.

Lastly, this approach addresses question 3 directly. It estimates the differences between real ethnic groups today, rather than “ancestries” with a probabilistic relationship to those groups. And, again because it looks at specific variants, rather than those variants aggregated into “DNA ancestry”, it makes it easier to research the environmental mechanisms which mediate any causally relevant genetic differences. As I said above, that’s a prerequisite for any informed policy response.

There is an argument against this approach, which is that much of our current genetic data only has common variants (single nucleotide polymorphisms) rather than whole-genome records. The measured “effects” of these variants typically include the effects of all the unmeasured variants which they correlate with (are in linkage disequilibrium with, in the jargon). But since these correlations can be different between ethnic groups, it makes it hard to estimate the counterfactual. (If, counterfactually, one group’s distribution of measured variants changed to match a different group, how much would the distribution of the true causal variants be changed?) It’s a fair point. But first, there are statistical ways to tackle this, and second, we are getting ever more whole-genome data. So, I think this approach will work better in the long run.

Code for the simulation.

