Natural selection: what's the effect size?

Apr 7, 2023

A work in progress, a.k.a. a cry for help

6 Comments

Apr 9, 2023

"...it would explain about 40% of variation in actual education. But our existing, noisy polygenic score only explained about 4.5% of the variation in education among our sample. The ratio between these two is 40/4.5 = 8.9."

Don't you need to square root those numbers to get the effect size ratio?

David Hugh-Jones

Apr 9, 2023

This worried me so I went back and wrote a little R example:

n <- 1e5

x_real <- rnorm(n)

y <- 1 * x_real + rnorm(n)

# measurement error eta has the same standard deviation as x_real

eta <- rnorm(n)

x_observed <- x_real + eta

# R2 about 50%

summary(lm(y ~ x_real))$r.squared

[1] 0.4977077

# R2 about 25% = 50% * s2_x/(s2_x + s2_eta)

summary(lm(y ~ x_observed))$r.squared

[1] 0.2468454

# about 1

coef(lm(y~x_real))["x_real"]

x_real

0.9995164

# about 1/2 = 1 * ratio of R2 = 1 * s2_x/(s2_x + s2_eta)

coef(lm(y~x_observed))["x_observed"]

x_observed

0.4979979

You can play with the ratios to persuade yourself.
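For anyone who wants to play with the ratios, here is a minimal sketch that wraps the example above in a function and varies the error variance. The helper name `compare_ratios`, the chosen values of `s2_eta`, and the seed are illustrative assumptions, not part of the original example. Note that it regresses on the raw, unstandardized `x_observed`, exactly as the example above does.

```r
# Hypothetical helper (not from the original comment): simulate y = x_real + noise,
# observe x_real with measurement error of variance s2_eta, then compare the
# R-squared ratio and the raw slope ratio against the reliability.
compare_ratios <- function(s2_eta, n = 1e5) {
  x_real <- rnorm(n)
  y <- x_real + rnorm(n)
  x_observed <- x_real + rnorm(n, sd = sqrt(s2_eta))

  fit_real <- lm(y ~ x_real)
  fit_observed <- lm(y ~ x_observed)

  c(
    s2_eta = s2_eta,
    # s2_x / (s2_x + s2_eta), with s2_x = 1
    reliability = 1 / (1 + s2_eta),
    r2_ratio = summary(fit_observed)$r.squared / summary(fit_real)$r.squared,
    slope_ratio = unname(coef(fit_observed)["x_observed"] / coef(fit_real)["x_real"])
  )
}

set.seed(1)  # arbitrary seed, only so the run is reproducible
ratios <- t(sapply(c(0.25, 1, 4), compare_ratios))
print(round(ratios, 3))
```

In each row, `r2_ratio` and `slope_ratio` should both land close to `reliability`, up to sampling noise. That only holds because `x_observed` is left on its raw scale here.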

Now I have spilled coffee on myself. I hope you're happy.

Guy

Apr 9, 2023 (edited)

I may not really know any math, but since you spilled coffee on yourself, and we don't know the relative g-loading of those two tasks, as far as we know we're both equally advanced intellectually.

Another thing I'm thinking about is that even if there was no selection at all on PGS for intelligence, wouldn't that imply that new mutations that decrease intelligence are not being selected against except to the degree they're harmful in other ways, which was presumably not the case back when a certain level of intelligence was sorely needed to feed your family? So that's one way your estimate could be an underestimate.

David Hugh-Jones

May 16, 2023

Hey, no, you were right, I think. As a first-year PhD student in Bologna pointed out to me, I wasn't correcting for the fact that my `x_observed` variable is no longer standard normal: adding `eta` doubles its variance. So our real "x_observed" is more like:

x_observed = (x_real + eta)/sqrt(var(x_real+eta))

and correcting for this makes your initial answer correct.
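A minimal sketch of that correction, reusing the setup of the earlier R example. The variable names `x_observed_std`, `r2_ratio` and `slope_ratio` and the seed are illustrative assumptions; the only substantive change is rescaling the noisy score to unit variance before regressing.

```r
set.seed(1)  # arbitrary seed, only so the run is reproducible
n <- 1e5
x_real <- rnorm(n)
y <- 1 * x_real + rnorm(n)
eta <- rnorm(n)
x_observed <- x_real + eta

# Rescale the noisy score to unit variance, so both predictors are on the same scale
x_observed_std <- x_observed / sd(x_observed)

# Rescaling a predictor leaves R-squared unchanged: still about 50% vs. about 25%
r2_real <- summary(lm(y ~ x_real))$r.squared
r2_observed <- summary(lm(y ~ x_observed_std))$r.squared

# The slopes are now per standard deviation of each predictor
b_real <- unname(coef(lm(y ~ x_real))["x_real"])
b_observed <- unname(coef(lm(y ~ x_observed_std))["x_observed_std"])

r2_ratio <- r2_real / r2_observed    # should be close to 2
slope_ratio <- b_real / b_observed   # should be close to sqrt(2), not 2

print(c(r2_ratio = r2_ratio, sqrt_r2_ratio = sqrt(r2_ratio), slope_ratio = slope_ratio))
```

With both predictors on a unit-variance scale, the slope ratio tracks the square root of the R2 ratio rather than the R2 ratio itself. On that reading, the 40/4.5 = 8.9 figure quoted at the top of the thread would correspond to an effect-size ratio of roughly sqrt(8.9), or about 3.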

David Hugh-Jones

Apr 9, 2023

Well, we just don’t know what happens for really rare mutations. I think Michael Nivard has done some work on these, but I don’t think he looked at their relationship to fertility.

David Hugh-Jones

Apr 9, 2023

I think not, actually. That was my intuition too. But see https://stats.stackexchange.com/questions/238878/how-do-errors-in-variables-affect-the-r2.

Wyclif's Dust