Thursday, May 21, 2015

Geeking out on the data!

After the last post, it occurred to me that the counts of direct ancestors and the numbers of (Nth great) grand aunts/uncles could be used to estimate the average number of children you could expect in each family:

Y = 2^(G-G0) α

where Y is the observed number of kids for each generation G, G0 is the starting generation, and α is the average number of kids per family, on the assumption that it's constant across generations. Since we're dealing with a comparatively closed population that was also overwhelmingly agrarian for 350 of the last 400 years, that's probably OK.

Here's the fun geeky part. In logarithmic space, that's:

log Y = (G-G0) log 2 + log α

So borrowing the data from the last posting for Generations 3 through 10 (so G0 = 3), we can solve for log α in each generation and take the average (0.766). For extra geekiness, the spread of those values about the mean gives the standard deviation (0.143). Together, those correspond to:
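For anyone who wants to try this at home, here is a minimal sketch of that step in Python. The function name `estimate_log_alpha` and the `counts` dictionary are made up for illustration, base-10 logs are assumed (they match the numbers quoted here), and so is the use of the sample standard deviation. The check at the bottom runs on synthetic counts generated from the model with α = 6, not on the family data.

```python
import math
import statistics

def estimate_log_alpha(counts, g0):
    """Return (mean, standard deviation) of log10(alpha) across generations.

    counts maps generation number G -> observed number of kids Y.
    """
    # Rearranged model: log10(alpha) = log10(Y) - (G - G0) * log10(2)
    per_generation = [
        math.log10(y) - (g - g0) * math.log10(2)
        for g, y in sorted(counts.items())
    ]
    # Sample standard deviation (an assumption; the population form is also plausible)
    return statistics.mean(per_generation), statistics.stdev(per_generation)

# Synthetic check only (NOT the family data): counts built from the model with alpha = 6
synthetic = {g: 6 * 2 ** (g - 3) for g in range(3, 11)}
mean_log, sd_log = estimate_log_alpha(synthetic, g0=3)
print(10 ** mean_log, sd_log)
```

With perfectly model-generated counts, the recovered α should come back as 6 with essentially zero spread; real counts will scatter around the mean.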

α = 5.8 ± 1.9

So, on average 4 to 8 children in each family!
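The conversion from log space back to a number of kids can be checked with a few lines of Python. This is a sketch under two assumptions: the logs are base 10, and the ± comes from first-order error propagation (σ_α ≈ α · ln 10 · σ_log). The variable names are just for illustration.

```python
import math

mean_log_alpha = 0.766  # mean of log10(alpha), from the post
sd_log_alpha = 0.143    # standard deviation of log10(alpha), from the post

# Back-transform the mean and the one-sigma bounds out of log space
alpha = 10 ** mean_log_alpha
low = 10 ** (mean_log_alpha - sd_log_alpha)
high = 10 ** (mean_log_alpha + sd_log_alpha)

# First-order error propagation for alpha = 10**x: d(alpha)/dx = alpha * ln(10)
sigma_alpha = alpha * math.log(10) * sd_log_alpha

print(f"alpha = {alpha:.1f} +/- {sigma_alpha:.1f}")
print(f"one-sigma range: {low:.1f} to {high:.1f}")
```

Either way of expressing the spread lands on roughly 4 to 8 kids per family.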

