"...it would explain about 40% of variation in actual education. But our existing, noisy polygenic score only explained about 4.5% of the variation in education among our sample. The ratio between these two is 40/4.5 = 8.9."
Don't you need to square root those numbers to get the effect size ratio?
This worried me, so I went back and wrote a little R example:
n <- 1e5
x_real <- rnorm(n)
y <- 1 * x_real + rnorm(n)
# x_real and the measurement error eta have equal standard deviation
eta <- rnorm(n)
x_observed <- x_real + eta
# R2 about 50%
summary(lm(y ~ x_real))$r.squared
# [1] 0.4977077
# R2 about 25% = 50% * s2_x/(s2_x + s2_eta)
summary(lm(y ~ x_observed))$r.squared
# [1] 0.2468454
# slope about 1
coef(lm(y ~ x_real))["x_real"]
#    x_real
# 0.9995164
# slope about 1/2 = 1 * ratio of R2 = 1 * s2_x/(s2_x + s2_eta)
coef(lm(y ~ x_observed))["x_observed"]
# x_observed
#  0.4979979
You can play with the ratios to persuade yourself.
Now I have spilled coffee on myself. I hope you're happy.
I may not really know any math, but since you spilled coffee on yourself and we don't know the relative g-loading of those two tasks, as far as we know we're both equally advanced intellectually.
Another thing I'm thinking about is that even if there was no selection at all on PGS for intelligence, wouldn't that imply that new mutations that decrease intelligence are not being selected against except to the degree they're harmful in other ways, which was presumably not the case back when a certain level of intelligence was sorely needed to feed your family? So that's one way your estimate could be an underestimate.
Hey, no, you were right, I think. As a first-year PhD student pointed out to me in Bologna, I wasn't correcting for the fact that my `x_observed` variable is no longer standard normal once the noise is added: in my example it has variance 2 rather than 1. So our real "x_observed" is more like:
x_observed = (x_real + eta)/sqrt(var(x_real+eta))
and correcting for this makes your initial answer correct.
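A minimal sketch of that correction, reusing the setup of the R example above but rescaling `x_observed` to unit variance. The seed and the helper names `r2_ratio` and `b_ratio` are illustrative assumptions, not part of the original comment.

```r
set.seed(1)  # arbitrary seed, only so the run is reproducible
n <- 1e5
x_real <- rnorm(n)
y <- 1 * x_real + rnorm(n)
eta <- rnorm(n)

# Rescale the noisy measure to unit variance, as in the corrected formula
x_observed <- (x_real + eta) / sqrt(var(x_real + eta))

r2_real <- summary(lm(y ~ x_real))$r.squared
r2_observed <- summary(lm(y ~ x_observed))$r.squared
b_real <- unname(coef(lm(y ~ x_real))["x_real"])
b_observed <- unname(coef(lm(y ~ x_observed))["x_observed"])

r2_ratio <- r2_real / r2_observed  # about 2
b_ratio <- b_real / b_observed     # about sqrt(2), not 2

c(r2_ratio = r2_ratio, sqrt_r2_ratio = sqrt(r2_ratio), b_ratio = b_ratio)
```

With both predictors on a unit-variance scale, the slope ratio should track the square root of the R2 ratio. Applied to the figures quoted at the top of the thread, that would turn the R2 ratio of 40/4.5 = 8.9 into an effect-size ratio of about sqrt(8.9), roughly 3, assuming both scores are standardized.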
Well, we just don’t know what happens for really rare mutations. I think Michael Nivard has done some work on these, but I don’t think he looked at their relationship to fertility.
I think not, actually. That was my intuition too. But see https://stats.stackexchange.com/questions/238878/how-do-errors-in-variables-affect-the-r2.