When I had the opportunity to discuss genealogy with my (mostly-distant) brother, talking about some distant relative, his response to just about every distant relation was "So, no relation at all."
Of course this was idiotic for several reasons, but mostly it was annoying to me because the math says otherwise.
For direct ancestors you get one share from each parent/grandparent, so:
- Parent, 50% from each
- Grandparent, 25% from each
- Great-grandparent, 12 1/2% from each
and so on. So basically 2^(-N), where N is the number of generations "up".
But what about your ancestor's descendants?
At first - not thinking about it - I assumed that in the X,Y notation I use for consanguinity, it'd just be 2^(-(X+Y)), but that's wrong. Why? Because your uncles/aunts (and Nth grand-uncles/aunts) don't suffer that first splitting, since your father/mother and their sibling (your aunt/uncle) have the same share of DNA from their parents.
So instead it's 2^(-Z), where Z is:
- X, if Y ≤ 1;
- X + Y – 1, if Y > 1
Thus:
- For first cousins (X = 2, Y = 2), it's 2^(-(2 + 2 - 1)) = 1/8, or 12.5%;
- For second cousins once-removed (4,3 or 3,4), it's 2^(-(4 + 3 - 1)) = 1/64, or 1.56%.
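Here's a minimal Python sketch of the formula above, just to make the arithmetic easy to check. It assumes X is the number of generations up from you to the common ancestor and Y is the number of generations back down to the relative. That reading, the function names, and the input guard are my assumptions for illustration.

```python
from fractions import Fraction


def degrees_of_separation(x: int, y: int) -> int:
    """Return Z for an (X, Y) relationship, using the piecewise rule above."""
    # Guard is an assumption of this sketch: the rule is only applied
    # to relationships that go at least one generation "up".
    if x < 1 or y < 0:
        raise ValueError("expected X >= 1 and Y >= 0")
    return x if y <= 1 else x + y - 1


def expected_share(x: int, y: int) -> Fraction:
    """Average expected fraction of shared DNA, 2^(-Z)."""
    return Fraction(1, 2 ** degrees_of_separation(x, y))


examples = {
    "parent (1,0)": (1, 0),
    "grandparent (2,0)": (2, 0),
    "aunt/uncle (2,1)": (2, 1),
    "first cousin (2,2)": (2, 2),
    "second cousin once-removed (4,3)": (4, 3),
}

for label, (x, y) in examples.items():
    share = expected_share(x, y)
    print(f"{label}: {share} = {float(share):.2%}")
```

Using `Fraction` keeps the results exact (1/8, 1/64, ...) before converting to a percentage for display.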
OK - great. Of course it's not ALWAYS a direct 50% contribution; there's a range. For first cousins, it's 7.3% to 13.8%.
But if your family tree is like mine, where there are lots of distant relatives marrying distant relatives (though not necessarily related to each other), how do you estimate the DNA you share with their descendants?
So, say your 5th cousin 3x removed (9,6) marries your 3rd cousin 2x removed (6,4). From the former there's a 2^(-14) share and from the latter there's a 2^(-9) share. I think you just add them and divide by 2. So (0.0061% + 0.1953%) / 2 = 0.2014% / 2 = 0.1007% for their kid (your 4th cousin 1x removed = 6,5), who, if the (9,6) weren't in the picture, would only be 0.0977%. (Clearly this matters more the closer the relations are - it gets interesting if, say, distant cousins on your father's side marry distant cousins on your mother's side.)
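The cousin-marriage arithmetic can be sketched in a few lines of Python. This takes the tentative "add them and divide by 2" rule at face value and assumes the two parents are not related to each other. The function names are made up for illustration.

```python
def share(x: int, y: int) -> float:
    """Expected shared fraction 2^(-Z) for an (X, Y) relationship."""
    z = x if y <= 1 else x + y - 1
    return 2.0 ** -z


def child_share(parent_a: float, parent_b: float) -> float:
    """Child gets half of each parent's DNA, so average the two shares.

    Assumption: the parents are unrelated to each other.
    """
    return (parent_a + parent_b) / 2


fifth_cousin_3x = share(9, 6)   # 2^(-14)
third_cousin_2x = share(6, 4)   # 2^(-9)

kid = child_share(fifth_cousin_3x, third_cousin_2x)
one_sided = share(6, 5)         # same kid if only the (6,4) parent were related

print(f"both parents related: {kid:.4%}")
print(f"one parent related:   {one_sided:.4%}")
```

A handy sanity check: with an unrelated spouse (share 0), `child_share` gives exactly the (6,5) figure, so the averaging rule agrees with the Z formula in the ordinary case.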
"But it's such a tiny number! Yes, but consider that there are million of base pairs in your DNA. In terms of all this genetic testing it's expressed in terms of centiMorgans (basically it's a "unit" of DNA). Parents each contribute about ~3,400 cM, and so you can use the percentage formula to estimate the degree of overlap in DNA.
https://www.yourdnaguide.com/scp/ has a great article and table. (They also give the range, which is helpful. Note that anything out at 3rd cousins or beyond (Z = 7 or more in the formula above) reaches the possibility that you could be related, yet actually share NO DNA with your distant cousin. By the same token, even eighth cousins (Z = 17) could overlap, with an average of 12 cM and as high as 50 cM.)
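To turn the percentage formula into centiMorgans, here's a rough Python sketch. It assumes that since a parent (Z = 1, a 50% share) corresponds to about 3,400 cM, the 100% baseline is about 6,800 cM. That baseline and the names below are my assumptions, not figures from the table.

```python
PARENT_CM = 3400            # approximate cM shared with one parent
TOTAL_CM = 2 * PARENT_CM    # assumed 100% baseline, since a parent is 50%


def expected_cm(z: int) -> float:
    """Naive average shared cM for Z degrees of separation: total * 2^(-Z)."""
    return TOTAL_CM * 2.0 ** -z


relationships = [
    ("parent", 1),
    ("1st cousin", 3),
    ("3rd cousin", 7),
    ("4th cousin", 9),
    ("8th cousin", 17),
]

for label, z in relationships:
    print(f"{label:>10} (Z = {z:2d}): {expected_cm(z):9.2f} cM")
```

For distant cousins this simple halving estimate drops far below the table's averages (eighth cousins come out well under 1 cM, versus the 12 cM quoted above), so treat it as a rough guide and lean on the published ranges.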
Looking at this the other way - if you compare your DNA with someone and get an overlap, you can invert the equation above to estimate the number of degrees of separation Z you have. Comparing that to the cM Project's chart, you'll notice that there is a LOT of overlap among different distant cousins. Say you find an overlap of 45 cM. That's about Z = 7.2, where 3rd cousins are Z = 7 (74 cM on average) and 4th cousins are Z = 9 (35 cM on average). But the ranges are what's important (0–217 for 3rd cousins, 0–127 for 4th), plus all of the other cousins (3C1R, 3C2R, 4C2R, etc.) whose ranges also include 45 cM. That's why your AncestryDNA or 23andMe "distant relative" matches all have ranges in the predicted relationship; they're measuring shared DNA segments and comparing them to the expected ranges.
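Inverting the formula is a one-liner with a base-2 logarithm. As a sketch, again assuming a 6,800 cM baseline (twice the roughly 3,400 cM shared with a parent, which is my assumption), Z = log2(6800 / shared cM):

```python
import math

TOTAL_CM = 6800  # assumed 100% baseline: 2 x ~3,400 cM shared with a parent


def estimate_z(shared_cm: float) -> float:
    """Invert shared_cm = TOTAL_CM * 2^(-Z) to get Z."""
    if shared_cm <= 0:
        raise ValueError("shared_cm must be positive")
    return math.log2(TOTAL_CM / shared_cm)


print(f"45 cM -> Z = {estimate_z(45):.1f}")  # prints: 45 cM -> Z = 7.2
```

The result is a point estimate only. Given how wide the real ranges are, a fractional Z like 7.2 just says "somewhere around 3rd cousin territory", not which cousin it is.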
I still need to work out the math for consanguine relationships (e.g., the above example, but where the two spouses are also, say, 3rd cousins to each other). I think in that case you have to follow each step along the way and apply the 50/50 mix separately (or at least use the formula above UNTIL you get to the consanguine relationship and then go step-by-step the rest of the way "down").
And, as of this morning, we're up to 67,916 people. :-)