Thursday, August 16, 2018

DNA Results #2 - confirmation of... the same unexpected result?

As I mentioned before, I went and splurged on a 23andMe DNA test to correlate with the Ancestry one.   I did this because my DNA breakdown from Ancestry didn't match my expectations:
  • 75% British/Irish (the Donahue and Hall lines, plus Bradish)
  • 25% French
But the Ancestry results were:

  • 91% Ireland/Scotland/Wales (59%) and Great Britain (32%)
  • 9% everything else.
But the mapping of that 32% ALSO seems to include parts of France, particularly Normandie, which I know is where most of the French ancestors originated.


Probably not surprisingly (given that there's likely little room for error in the DNA test itself), the 23andMe results are in line with the Ancestry ones:

  1. British and Irish:   86.5%
  2. French and German:  6.8%
  3. Broadly NW European:  5.6%
  4. everything else (European):  1.1%
So, lumping all of the non-British/Irish together, that's 13.5%, only a little more than half of the 25% I expected.

This leads into the whole "was Célina Boulé actually an Irish adoptee" hypothesis.

If she is French, then the 75/25 expected split is still there: there's just not enough English/Irish in the more distant ancestors to make up a 1/8th discrepancy.

BUT if she's Irish: then 13/16 of my 4th-generation ancestors are British/Irish = 81.25% and everything else is 18.75%.   Given that we know that SOME of the Québec/Acadian ancestors married non-French people (not many, but a few), then we START to get closer to the stated results.
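
The arithmetic behind the two scenarios can be checked with a short sketch. It assumes 16 ancestors at the 4th generation, each contributing an equal 1/16 share (real inheritance is not that tidy), with 12 of them British/Irish in the baseline case. The function name is just for illustration.

```python
from fractions import Fraction

def expected_split(british_irish_slots, total_slots=16):
    """Return (British/Irish %, everything-else %), assuming equal shares per ancestor."""
    bi = Fraction(british_irish_slots, total_slots)
    return float(bi * 100), float((1 - bi) * 100)

# Baseline: 12 of the 16 4th-generation ancestors are British/Irish (the 75/25 expectation).
print(expected_split(12))   # (75.0, 25.0)

# If Célina is Irish, one French slot flips to British/Irish.
print(expected_split(13))   # (81.25, 18.75)

# 23andMe: everything that is not "British and Irish", lumped together.
other = round(6.8 + 5.6 + 1.1, 1)
print(other)                # 13.5
```

Even the 13/16 case leaves about 5 points between the expected 18.75% and the reported 13.5%, which is the gap the non-French marriages among the Québec/Acadian ancestors would have to cover.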

The other thing that's cool about the 23andMe results is that their reporting is more in-depth:

Maternal Haplogroup:  J1c1

This mostly stems from central Europe, the Balkans and the Ukraine.   But I suppose it would extend to France too.

Paternal Haplogroup: R-S15280.1

Very, very Irish.

I'm also more Neanderthal than 68% of 23andMe customers!   Yay!

I was able to get Dad to do both Ancestry and 23andMe tests.   Results pending.   I did this because it will also help with the "is Célina Irish" test since it will let me immediately distinguish for all of the identified "distant cousins" that both sites offer which SIDE of the family they're on (if they're distant cousins of both Dad and me, then they're on the Irish side, otherwise the French/Irish side).  

Hopefully enough of these distant cousins might clump in THEIR overlaps to suggest who Célina really was.
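
Once Dad's results are in, the side-of-the-family sorting is just set arithmetic. Here is a minimal sketch; the match IDs are made up, and it assumes both sites let me get at the match lists in some comparable form.

```python
def split_by_side(my_matches, dads_matches):
    """Partition my DNA matches by whether Dad also matches them."""
    mine, dads = set(my_matches), set(dads_matches)
    paternal = mine & dads   # shared with Dad -> his (Irish) side
    maternal = mine - dads   # not shared with Dad -> Mom's (French/Irish) side
    return paternal, maternal

# Hypothetical match IDs, for illustration only.
me = {"cousin_a", "cousin_b", "cousin_c", "cousin_d"}
dad = {"cousin_a", "cousin_c", "cousin_x"}

paternal, maternal = split_by_side(me, dad)
print(sorted(paternal))  # ['cousin_a', 'cousin_c']
print(sorted(maternal))  # ['cousin_b', 'cousin_d']
```

This is only a first pass: a very distant cousin could fall just under the matching threshold for Dad while still showing up for me, and someone related on both sides would land in the "paternal" pile.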

Where things are now...

I've (finally) finished the first- and second-generation Acadiens.

Whew!

This was a lot of work, because most of my typical work flow had to be adjusted: the LaFrance doesn't have the Acadian records (since they only map Québec parishes).   Ancestry has some of them (many of them were destroyed by the British at one time or another).  While there are records for Beaubassin, Port Royal, and Grand-Pré, other locations had no records at all.

Fortunately, I found a web site with the Port Royal records neatly organized by family name and date.

But the abundance of gaps, combined with the entire population of French Acadia being dispersed in the 1750s, made things hard to follow:  many went to the British colonies, others to France, and others to other French settlements: Louisiana, Québec or places like Miquelon (where I was also able to find some records).   At first I was puzzled by those who went to New England, the Carolinas, etc.:  why would you leave a British take-over of Acadia to go to another British colony?   Then I found out that most were forcibly deported TO those places by the British with the idea that the displaced families would re-integrate --- the attempt was made to send them to somewhat rural places (e.g., western Massachusetts); however, most didn't stay there and moved to the cities (which the British had tried to avoid).    Things were particularly awful in the case of Québec City: an outbreak of measles became an epidemic, and hundreds of people died around 1757-1760, with entire families being wiped out.

Another situation came up while trying to determine if spouses of family members I was researching were also distant cousins (which with the Acadiens was extremely common: consanguine marriages of the 3rd and 4th degree were prevalent): in one case, the spouse ended up having a HUGE family tree archived on WikiTree (we're talking several THOUSAND people, hitting pretty much every single royal family in Europe, and aristocracy galore) - so that took several weeks to map.   (I would've quit, but my OCD kept me going, and I figure that it'll eventually come in handy if I ever get a breakthrough on the Irish parts of the family tree!)

So, now I'm going through all of the WikiTree entries for the direct descendants, filling in the blanks and getting a sense of what will be involved in finishing up the first- and second-generation mapping for the pre-Québec families.   That'll be the last "phase" of this project.

...  Then we start the third generation with (by my estimate) about 8-9,000 families to map.

Given that it took over three years to get this far, it might be 2021 or so before I'm finished with that.

Monday, April 23, 2018

Genomic Collapse is a Bitch

Sometimes distant cousins marry.

Sometimes not-so-distant cousins marry.

And then there's Jaime and Cersei Lannister, but that's another story.

In any case when this happens, the existence of shared ancestors begins to shrink down the number of people in the family tree, compared to what could possibly be there.

I'm defining "Genomic Collapse" as:

One minus the ratio of unique people in a subset of the family tree to the number of "filled slots" in that subset.

So in the extreme case of Joffrey Lannister (whose parents were brother and sister), that's 50% genomic collapse at a minimum because the maternal and paternal blood lines are identical; and "at a minimum" because any other consanguine relationships further up continue to contribute to the collapse.

So - what is it for my maternal family tree?


  • Out to 10 generations:  41.4% complete tree but with 30.2% collapse (296 people in 424 filled slots out of 2046 total slots);
  • Out to 15 generations: 5.9% complete but with 49.0% collapse (988 people in 1,937 filled slots out of 65,534 total slots);
  • Out to 20 generations: 0.2% complete but with 51.4% collapse (1,074 people in 2,208 filled slots out of 2,097,150 total slots).
I'm not sure if this is the right way to do this because once you hit a non-unique ancestor, that effect doubles with every generation back.
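
The three bullets can be reproduced with a few lines. This sketch rests on two readings inferred from the numbers themselves, so treat both as assumptions: collapse is taken as 1 minus (unique people / filled slots), and completeness is measured against half of the quoted total slots, which would fit counting only the maternal half of the tree.

```python
def collapse_stats(unique_people, filled_slots, generations, maternal_only=True):
    """Return (collapse %, completeness %) for a tree out to `generations`."""
    total = 2 ** (generations + 1) - 2            # slots in generations 1..n of a full tree
    denom = total // 2 if maternal_only else total  # assumption: maternal half only
    collapse = 1 - unique_people / filled_slots
    complete = filled_slots / denom
    return round(collapse * 100, 1), round(complete * 100, 1)

# (generations, unique people, filled slots) from the bullets above
for gens, people, filled in [(10, 296, 424), (15, 988, 1937), (20, 1074, 2208)]:
    print(gens, collapse_stats(people, filled, gens))
# 10 (30.2, 41.4)
# 15 (49.0, 5.9)
# 20 (51.4, 0.2)
```

The doubling worry still applies: this counts every repeated slot equally, whether the repeat comes from one cousin marriage far down the tree or from many independent ones higher up.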

In any case, going from generation to generation it looks like this:



The blue line is the percent overlap at that generation.   The green line is the % overlap for that generation and all the ones preceding it.    The orange line is the highest possible percentage of overlap (i.e., all unknown ancestors for that generation are all repeats of known ancestors), the red line is the lowest possible percentage (all unknown ancestors are new people).



Wednesday, April 18, 2018

I think I know how to find Célina Boulé

It occurred to me last night as I was falling asleep that I might be able to use the DNA results to identify Célina Boulé's parents.

It's something of a long-shot, but given this tree:


starting with my grandmother, what we're trying to find is the set of 3rd great-grandparents that's missing.  

All of the DNA matching products try to match you up with distant cousins.   I'm slowly making my way through that list of people, looking at their family trees (when they've bothered to make one) to see if I can find common ancestors, and thereby establish our relationship.   A few have had Alexandre Guimond and Célina Boulé as the common ancestor.

But some of these supposedly "high-confidence" matches appear to have no common ancestors.  This got me thinking - what if the common ancestors are the missing parents for Célina Boulé?  And - how could we identify who they are?

What I need to do is find 4th cousins, determined genetically but NOT through a shared family tree.  Why?  Because if we can identify the common ancestors through a family tree, then we know they're not Célina's parents.    What we want to find are all the 4th cousins for whom it's not possible to find a common ancestor but who we are certain are on the Québec side of the family (because otherwise the common ancestor might be one of the unknown Irish or British 3rd great-grandparents).

So:
  1.  Are we genetic 4th cousins?
  2.  Do our family trees overlap in Québec?   (Or not - see below!)
  3.  Do we not see any overlap at the 5th generation?    

Note that this isn't as difficult to confirm as you might think.  There are only three family groups to consider:   a) Narcisse Guimond and Céleste Sévigny,  b) Basile Tousignant and Marguerite Maillot,  and c) Pierre Bélanger and Thérèse Maillot.  If none of these families are there, then the only ones that remain are the parents of Célina Boulé.

Nonetheless, for this to work, it still requires one other thing:  Célina MUST have a sibling and my 4th cousin must be a descendant of that sibling.   Otherwise my 4th cousin and I will have the same "hole" in the family tree with Célina being a dead-end.
So - given a set of 4th cousin candidates, under the conditions above, the intersection of all of our family trees' common ancestors should point to the parents of Célina Boulé.   This can get tricky, because it's possible that her parents are already in my family tree through some other route.   If that location in the tree is far away from the 3rd great-grandparent position (i.e., they're also C:9,4's or something) it might still work because the DNA correlation for such a distant cousin would be negligible.  
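
As a rough sketch of the filtering and intersection described above: the three known couples come from this post, but the cousin records, field names, and placeholder couples are all hypothetical, and it assumes each cousin's tree has already been boiled down to a list of couples at the relevant generation.

```python
KNOWN_FAMILIES = {
    ("Narcisse Guimond", "Céleste Sévigny"),
    ("Basile Tousignant", "Marguerite Maillot"),
    ("Pierre Bélanger", "Thérèse Maillot"),
}

def candidate_parents(cousins):
    """Intersect the couples in the trees of cousins who pass all three checks."""
    shared = None
    for c in cousins:
        if not c["genetic_4th_cousin"] or not c["quebec_side"]:
            continue                      # fails check 1 or 2
        couples = set(c["couples"])
        if couples & KNOWN_FAMILIES:
            continue                      # fails check 3: already explained by a known family
        shared = couples if shared is None else shared & couples
    return shared or set()

# Hypothetical cousins, for illustration only.
cousins = [
    {"name": "match_1", "genetic_4th_cousin": True, "quebec_side": True,
     "couples": [("Placeholder Père", "Placeholder Mère"), ("Other A", "Other B")]},
    {"name": "match_2", "genetic_4th_cousin": True, "quebec_side": True,
     "couples": [("Placeholder Père", "Placeholder Mère"), ("Other C", "Other D")]},
    {"name": "match_3", "genetic_4th_cousin": True, "quebec_side": True,
     "couples": [("Narcisse Guimond", "Céleste Sévigny")]},
]
print(candidate_parents(cousins))  # {('Placeholder Père', 'Placeholder Mère')}
```

A strict intersection is probably too harsh in practice, since one unrelated match wipes out the result; counting how often each couple shows up across the candidates might be a more forgiving variation.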

What if Célina is not Québecois?

One of the likely possibilities is that Célina is adopted and might not have a Québec background!  Instead, it's possible she's an Irish immigrant escaping the Irish potato famine.  There's no evidence to support this but the timing fits, and there was a huge influx of refugees from Ireland in 1847 (at which point Célina would be about 7 years old).  Nearly 100,000 immigrants came through quarantine at La Grosse Île, about 30 miles downstream from Québec, before being relocated to Québec, Canada West, and the US.

The mortality rate from typhus was huge: about 1 in 6 died during, or shortly after, the crossing.  So, it's also possible that she was orphaned in this way.     There were programs set up for adoption, typically run by the church, where (usually older) children were placed with families.  It was more of a "foster" system than adoption; the children were seen more as convenient labor than part of the family, and typically kept their original family names.   So Célina might be one of the exceptions: if  - as I suspect - she was adopted/fostered by Moïse Boulé and Domitille Bernier (who do not appear to have had any children of their own), she took on their name (at least some of the time).

(Also, "orphan" has a slightly different meaning in the context of the situation:  some "orphans" had a living parent, but one who was not able to care for them (sickness or destitution), which makes the "adoption" more like foster care.)

But her Irishness can be tested too: if there's clearly Québec integration from the late 19th century (i.e., Célina's sibling also married into a Québec family and had children) but the only common ancestors are Irish and never Québecois, then that would lend support to the "Célina is an Irish immigrant" theory.

It doesn't appear there's much in the way of records showing which children were placed with which families.  Another avenue I have not attempted is to see if there are other children that might've been adopted/fostered by Moïse Boulé.  

UPDATE: 8/16/2018

No conclusive results - yet!   But an interesting thing came out of the DNA testing.  According to both Ancestry and 23andMe, my DNA is ~85% British/Irish and ~15% everything else.   Based on the locations of ancestors in my family tree, at the 4th generation, I'd expect that to be more like 75/25, UNLESS Célina is Irish, in which case it becomes more like 81/19 which STARTS to look like the actual results.

That doesn't "prove" anything but it's one more datum that supports the hypothesis.


Sunday, April 15, 2018

Estimating variance based on family DNA

So I'm still curious about the DNA results...

I'm thinking that the lack of French ethnicity might be due to a weird roll of the dice in terms of the DNA makeup I inherited from my father and mother.   Part of this comes from comparison to a first cousin: we share the Bradish/Guimond side of things but his mother's side of his family is far removed geographically from my father's side of the family.   Yet he comes up with 11% French, whereas I come up with 0%.   Since the overlap is on my mother's side, I think I can safely preclude any "Jerry Springer"-esque explanation.  :-)

So - how to model this?   It occurred to me that getting DNA samples from my father and sister (or brother) might be enlightening.

For my father, there should be roughly 50% overlap (from a zeroth-order assumption).  So anything that doesn't overlap comes from my mother.   If the overlap is far more than 50% (and I'm not a biologist so I don't know if this is a valid hypothesis) then - borrowing from Game of Thrones - "the blood is strong" and genetically, I'm more Irish than French.   I'd expect that my father should come out almost exclusively Irish with some English.   But for my siblings, I suppose anything is possible in terms of the Irish/English/French ratios (unless my hypothesis is entirely invalid).

In the case of my siblings, they should also have roughly the same 50/50 split (as a base assumption), but with different overlaps.   Quantitatively, the difference would be some kind of estimate of the variance within one generation of "shifting" - in other words, "you have your father's nose, but your mother's eyes" might apply to one child, but the opposite for another. 

So, if there's X% variance between siblings, then that effect gets multiplied as you go back in time, complicated by the additional consanguine relationships with in-laws with the eventual "genome collapse" that happens when those family lines intersect from a shared common ancestor.  (The math for that might be completely impossible to do - but I think it would be a fun challenge!).
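
To get a feel for the one-generation "shifting", here is a deliberately crude simulation. It assumes my mother's DNA is half from her (French) mother and half from her father, and that what she passes on arrives in independent chunks, each equally likely to come from either of her parents. The 60 chunks is an arbitrary illustrative number, not a biological constant, and real recombination is messier than this.

```python
import random
import statistics

def simulate_french_share(n_segments=60, rng=random):
    """Share of a child's genome from the maternal grandmother (crude model)."""
    from_grandmother = sum(rng.random() < 0.5 for _ in range(n_segments))
    # The mother supplies half of the child's genome in total.
    return 0.5 * from_grandmother / n_segments

rng = random.Random(2018)  # fixed seed so reruns give the same numbers
siblings = [simulate_french_share(rng=rng) for _ in range(10_000)]

print(f"mean  {statistics.mean(siblings):.3f}")
print(f"stdev {statistics.pstdev(siblings):.3f}")
print(f"range {min(siblings):.3f} to {max(siblings):.3f}")
```

The average should sit near the expected 25%, with individual simulated siblings scattered around it. Whether that scatter is anywhere near enough to explain an 11% versus 0% difference is exactly what the real tests would have to show.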

I also have to wonder to what extent the rabid genome researchers would love to sample the DNA from these ancestors.  Aside from the nasty thought of "grave robbing DNA", it would be extremely interesting and could solve some mysteries.   The biggest dead-end on my mother's side is her great-grandmother Célina Boulé:  was she adopted?  Who are her parents?  It's extremely likely that she is also a distant cousin from some blood line, but I think I've successfully ruled out the handful of "possible relationships" that others have used on their family trees. 

There's also the intriguing -- though I think unlikely -- possibility that she isn't Québecois at all.   Apparently there are cases of Irish children fleeing the potato famine being sent to North America.  Perhaps she's one of them - which would also inject Irish ethnicity into what otherwise would've been a nearly 100% French genome, for me only 4 generations back (so 1/16th overall but 1/4 of the Guimond line).   If it's also true that she was adopted by Moïse Boulé and Domitille Bernier, it might make sense, since as far as I can tell they had no children of their own.

I have two hopes:  1) that I can get some breakthroughs on my father's side of the family.  Finding a few bona-fide third or fourth cousins might open up some avenues;  2) ditto on the Guimond side, in the hopes that eventually we solve Célina's past.

Saturday, April 14, 2018

DNA results - part 2 - distant cousins

So one of the OTHER "features" of the DNA testing is that it's supposed to help you identify potential distant cousins based on likely common ancestors.

From the Ancestry.com side of things, this is something of a bust too.   At least for me.

I've got 55,000+ people on the family tree.   But there are several dead-ends - especially on the Irish side and on the Bradish line.   So I was hoping to find others who might be able to provide a breakthrough or two.

Ancestry has identified ~250 people with whom I have a shared ancestor.   However, looking at their profiles, they fall into three categories:


  1. People who took the DNA test but haven't done their family tree - no help there;
  2. People who started on their tree but it only has a few (e.g., 10) people on it - no help there;
  3. People who have fairly large family trees.  YAY...  but hey - their entries use the same format I do for names, etc...
... oh dear - their family tree is based mostly on getting data from another tree...   Mine!

Oh well  :-)

But there is one first cousin whose tree has a different set of parents for John Patrick Bradish (the great-grandparent who was born in Punjab) and they're Irish, not English.   This changes the Ethnicity assessment (skewing from mostly English to mostly Irish).   The parents I have came from a Google search, are VERY low-confidence, and relied on the assumption they'd be English (since he served in the British army).   Although Ireland was part of the United Kingdom at the time, it just seemed unlikely to me that many people would enlist in the British Army from Ireland.   But that might not be a valid assumption at all, and even if it were, it doesn't mean that this isn't an exception to the rule!

Something to follow up on!

DNA Results are in... What the hell?


So I get my Ancestry DNA results.


  • 59% Ireland/Scotland/Wales 
  • 32% Great Britain
  • 8% Other


I expected:


  • 50% English (based on the Bradish/Murphy and the Hall/Murphy lines, but see the next posting)
  • 25% Irish (Donahue line)
  • 25% French (Guimond line)
Where's the French?

Now all the Ancestry DNA ads have some kind of "Find Exciting Surprises" where someone takes the test thinking they're an Italian/German mix and discovers they're 80% Russian and 20% Japanese or something like that.

But the largest chunk of my family tree that's mapped is Québecois - and I have HUNDREDS (probably a few thousand at this point) of original settlers who were born in FRANCE.   Now I know SOME of them started out somewhere else, went to France and from there went to Québec.   I know that some of the immigrants came from England, Ireland, Germany, Switzerland, Spain, and Italy.  But not THAT many.

I don't know what to make of this.   We've joked that there's some kind of "Jerry Springer" episode in here somewhere "you are NOT the father" - except that it would have to be "you are NOT the mother" and that would be difficult to pull off since surrogate motherhood is definitely more of a 21st century occurrence.

But there are other possibilities.   The Ethnicity estimate comes from where your current-day relatives who have taken the DNA test reside NOW.   So one might expect some degree of variance from that.  EXCEPT that one could do the appropriate weighting based upon coverage of test subjects versus actual population (and they have to do this I think, otherwise NO ONE's results would be accurate).

There's also the fact that one doesn't necessarily inherit the same degree of genetic material from EACH ancestor proportionally.   Perhaps "the blood is strong" (to paraphrase Jon Arryn) is in play here, and it's just that my paternal contributions to my DNA outweigh the maternal.

So I think I'll splurge and get a comparison test with 23andMe.   I'm also wondering if I can convince dad or a sibling to also do the test so that we can compare.