Wednesday, December 26, 2018

ARGH! Ancestry's new feature is disappointing...

I just discovered THIS:  http://apv.ancestry.com/50558183%3A9009%3A66/overview?treeid=54485674&personid=13681670623

It has the "life story" of Célina Boulé!    Marvelous, you say!  Mystery solved.

Um, not so much.   Here it is:

When Célina Marie Boulé dit Laliberté was born on March 18, 1840, in Quebec City, Quebec, Canada, her father, Célestin, was 47, and her mother, Marie, was 44. She married ALEXANDRE GUIMOND and they had nine children together. She also had one son and three daughters from another relationship. She died on April 18, 1928, in Lotbinière, Quebec, Canada, at the age of 88, and was buried in Lotbinière, Quebec, Canada.
Here are the problems with that:

  1. We don't know when or where she was born.   It was AROUND 1840 based on her death record.   No one has ever been able to find anything that conclusively proves the March 18 date; it just gets passed around from Ancestry tree to tree.
  2. "Her father Célestin" - except that he's not her father.   People have mixed up two different Célinas; ours has no birth record, and her marriage record does not list her parents.
  3. "She also had one son and three daughters from another relationship".  No - she didn't.  That's the OTHER Célina.
This REALLY PISSES ME OFF, because it's being touted as a new "feature" of Ancestry, WITHOUT disclaimers that the data presented might be entirely incorrect.

So that means that people who are doing research and don't know to dig under the surface might just "accept" it as fact. 

This completely goes against ALL the principles of genealogy.   Ancestry is being sloppy for the sake of marketing a new "feature" that has no checks and balances.

Wednesday, December 5, 2018

Updated DNA Ancestry - interesting

So, Ancestry has updated their regional reporting:

  • 79% Irish
  • 12% English
  • 9% French (they didn't have French before)
At the 4th generation back, I have:
  • 8 known Irish ancestors
  • 2 suspected Irish ancestors (based on last name)
  • 2 English ancestors
  • 3 French/Québecois ancestors
  • 1 unknown origin (Célina Boulé)
So that means:
  • 79% Irish vs.  62 1/2%
  • 12% English vs. 12 1/2%
  • 9% French vs. 18 3/4%
OK - so the English is a great match.   I don't understand the under-reporting of the French because even though a FEW Québec relatives married English or Irish immigrants, they were mostly at the distant cousin level, and not among the direct ancestors, but there are a few Europeans in there (at the 1-2% level or so).

If Célina is French/Québecois, then that would make it 25% French (observed) vs. 9% in the DNA results.  On the other hand, if she is Irish (a famine refugee) then it's 68 3/4% Irish (11 of 16) vs. 79%, which is a better (but not complete) match.
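To double-check the tree-based percentages, here's a minimal Python sketch. It assumes each of the 16 ancestors at the 4th generation contributes exactly 1/16 (which real inheritance only approximates), and the names `ancestors` and `expected_share` are just mine for illustration.

```python
from fractions import Fraction

# Counts of 4th-generation ancestors (16 slots) as listed above.
# "Irish" lumps the 8 known and 2 suspected Irish ancestors together.
ancestors = {"Irish": 8 + 2, "English": 2, "French": 3, "Unknown": 1}

def expected_share(counts, unknown_as=None):
    """Return {origin: percent}, optionally assigning the unknown slot to an origin."""
    counts = dict(counts)
    unknown = counts.pop("Unknown", 0)
    if unknown_as is not None:
        counts[unknown_as] = counts.get(unknown_as, 0) + unknown
    # Keep the unknown slot in the denominator even when it is unassigned.
    total = sum(counts.values()) + (unknown if unknown_as is None else 0)
    return {origin: float(Fraction(n, total) * 100) for origin, n in counts.items()}

as_french = expected_share(ancestors, unknown_as="French")
as_irish = expected_share(ancestors, unknown_as="Irish")
print(as_french)  # Irish 62.5, English 12.5, French 25.0
print(as_irish)   # Irish 68.75, English 12.5, French 18.75
```

Either way the tree predicts noticeably more French than the 9% reported, so Célina alone can't close the gap.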

The only other possibility is that one entire sub-branch of my family (at the 5th generation) is not Québecois but is Irish.   That doesn't seem possible: the other Québecois branch has a set of first-cousins (which is another oddity, but I'll leave that aside) at the 4th generation, which ONLY leaves the Guimond/Sévigny marriage, and there's nothing odd in that family record to suggest that anyone (specifically Elusippe Guimond) was illegitimate.

So we're back to trying to discern what the range of error is in the reporting.


Saturday, November 24, 2018

% DNA shared

When I had the opportunity to discuss genealogy with my (mostly-distant) brother, talking about some distant relative, his response to just about every distant relation was "So, no relation at all."

Of course this was idiotic for several reasons, but mostly it was annoying to me because the math says otherwise.

For direct ancestors you get one share from each parent/grandparent, so:


  1. Parent, 50% from each
  2. Grandparent, 25% from each
  3. Great-grandparent, 12 1/2% from each
and so on.  So basically 2^(-N) for N generations "up".

But what about your ancestor's descendants? 

At first  - not thinking about it - I assumed that in the X,Y notation I use for consanguinity, it'd just be 2^(-(X+Y)),  but that's wrong.   Why?  Because your uncles/aunts (and Nth grand-uncles/aunts) don't suffer that first splitting, since your father/mother and their sibling (your aunt/uncle) have the same share of DNA from their parents.

So instead it's 2^(-Z), where Z is:


  • X,  if Y ≤ 1;
  • X + Y – 1, if Y > 1
Thus:

  • For first cousins (X = 2, Y = 2),  it's  2^(-(2 + 2 - 1)) = 2^(-3) = 1/8, or 12.5%;
  • For second cousins once-removed (4,3 or 3,4), it's 2^(-(4 + 3 - 1)) = 2^(-6) = 1/64, or 1.56%
OK - great.   Of course it's not ALWAYS a direct 50% contribution: there's a range: for first cousins, it's 7.3% to 13.8%.
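Here's the formula as a small Python sketch, just to check the worked cases. The function names are mine, and it assumes X >= 1 (i.e., the relative is not one of your own descendants), since the rule above was only worked out for ancestors and their descendants.

```python
def degrees_of_separation(x, y):
    """Z for an X,Y relationship: X if Y <= 1, otherwise X + Y - 1.

    Assumes x >= 1, i.e. the relative is not your own descendant.
    """
    return x if y <= 1 else x + y - 1

def shared_fraction(x, y):
    """Average expected share of DNA, 2^(-Z)."""
    return 2.0 ** -degrees_of_separation(x, y)

examples = {
    "parent (1,0)": (1, 0),
    "grandparent (2,0)": (2, 0),
    "aunt/uncle (2,1)": (2, 1),
    "first cousin (2,2)": (2, 2),
    "second cousin once removed (4,3)": (4, 3),
}
for label, (x, y) in examples.items():
    print(f"{label}: {shared_fraction(x, y):.2%}")
```

These are averages only; the real-world ranges (like the 7.3% to 13.8% for first cousins) come from the randomness of inheritance, which this doesn't model.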

But if your family tree is like mine, where there are lots of distant relatives marrying distant relatives (though not necessarily related to each other), how do you estimate the DNA you share with their descendants?

So, say your 5th cousin 3x removed (9,6) marries your 3rd cousin 2x removed (6,4).   From the former there's a 2^(-14) share and from the latter there's a 2^(-9) share.  I think you just add them, and divide by 2.   Or, 0.0061% + 0.1953% = 0.2014%, and 0.2014% / 2 = 0.1007% for their kid (your 4th cousin 1x removed = 6,5), who, if the (9,6) weren't in the picture, would only be 0.0977%.   (Clearly this matters more the closer the relations are - it gets interesting if, say, distant cousins on your father's side marry distant cousins on your mother's side.)

"But it's such a tiny number!"    Yes, but consider that there are billions of base pairs in your DNA.   In genetic testing, shared DNA is expressed in centiMorgans (basically a "unit" of DNA).   Parents each contribute about 3,400 cM, and so you can use the percentage formula to estimate the degree of overlap in DNA.
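Going back to the cousin-marriage example (the (9,6) marrying the (6,4)), here's the add-and-halve idea as a Python sketch. It rests on my own guess that the two parents' contributions simply average, which assumes the parents are unrelated to each other and that their shared segments don't overlap; `child_share` is a made-up name.

```python
def shared_fraction(x, y):
    """Average expected share for an X,Y relationship: 2^(-Z)."""
    z = x if y <= 1 else x + y - 1
    return 2.0 ** -z

def child_share(parent_a, parent_b):
    """Share with the child of two relatives: half of each parent's share.

    Assumption: the two parents are unrelated to each other.
    """
    return (shared_fraction(*parent_a) + shared_fraction(*parent_b)) / 2

combined = child_share((9, 6), (6, 4))  # both parents are my cousins
alone = shared_fraction(6, 5)           # only the (6,4) parent is related
print(f"{combined:.4%}")  # 0.1007%
print(f"{alone:.4%}")     # 0.0977%
```

The second number is just the plain formula for a 4th cousin 1x removed, so the difference between the two is the extra bit contributed by the (9,6) parent.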

https://www.yourdnaguide.com/scp/ has a great article and table.   They also give the range, which is helpful.   Note that anything beyond 3rd cousins (Z = 7 above) reaches the possibility that you could be related yet actually share NO DNA with your distant cousin; by the same token, even eighth cousins (Z = 17) could overlap with an average of 12 cM and as high as 50 cM.

Looking at this the other way - if you compare your DNA with someone and get an overlap, you can invert the equation above to estimate the number of degrees of separation Z you have.   Comparing that to the cM Project's chart, you'll notice that there is a LOT of overlap among different distant cousins: say you find an overlap of 45 cM.    That's about Z = 7.2, where 3rd cousins are Z = 7 (74 cM on average), and 4th cousins are Z = 9 (35 cM on average).    But the ranges are what's important (0–217 for 3rd cousins, 0–127 for 4th), plus all of the other cousins (3C1R, 3C2R, 4C2R, etc.) whose ranges also include 45 cM.   That's why your Ancestry DNA or 23andMe "distant relative" matches all have ranges in the predicted relationship; they're doing a comparison of shared DNA segments and comparing them to the expected ranges.
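The inversion is just a base-2 logarithm. A rough Python sketch, assuming a total of 6,800 cM (twice the ~3,400 cM per parent, so that Z = 1 comes out at 3,400 cM); `TOTAL_CM`, `expected_cm`, and `estimate_z` are my own names, and the result is only the formula's average, not the Shared cM Project's observed figures.

```python
import math

# Assumption: 2 x ~3,400 cM, so that a parent (Z = 1) comes out at 3,400 cM.
TOTAL_CM = 6800.0

def expected_cm(z):
    """Average shared cM predicted by the 2^(-Z) formula."""
    return TOTAL_CM * 2.0 ** -z

def estimate_z(shared_cm):
    """Invert the formula: degrees of separation implied by a shared-cM value."""
    return -math.log2(shared_cm / TOTAL_CM)

print(expected_cm(1))            # 3400.0
print(round(estimate_z(45), 1))  # 7.2
```

Since Z usually comes out fractional, it's best read as "somewhere around 3rd cousin", which is exactly why the testing sites report a range of possible relationships.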

I still need to work out the math for consanguine relationships (e.g., the above example but they're also, say, 3rd cousins to each other).   I think in that case you have to follow each step along the way and apply the 50/50 mix separately (or at least do the formula above UNTIL you get to the consanguine relationship and then go step-by-step the rest of the way "down").

And, as of this morning, we're up to 67,916 people.  :-)




Thursday, August 16, 2018

DNA Results #2 - confirmation of... the same unexpected result?

As I mentioned before, I went and splurged on a 23andMe DNA test to correlate with the Ancestry one.   I did this because my DNA breakdown from Ancestry didn't match my expectations:
  • 75% British/Irish (the Donahue and Hall lines, plus Bradish)
  • 25% French
But the Ancestry results were:

  • 91% Ireland/Scotland/Wales (59%) and Great Britain (32%)
  • 9% everything else.
But the mapping of that 32% ALSO seems to include parts of France, particularly Normandie, which I know is the origin of most of the French ancestors.


Probably not surprising (given that there's likely little room for error in the DNA test itself), the 23andMe results are in line with the Ancestry ones:

  1. British and Irish:   86.5%
  2. French and German:  6.8%
  3. Broadly NW European:  5.6%
  4. everything else (European):  1.1%
So, lumping all of the non-British/Irish together, that's 13.5%, only a little more than half of what I expected.

This leads into the whole "was Célina Boulé actually an Irish adoptee" hypothesis.

If she is French, then the 75/25 expected split is still there: there's just not enough English/Irish in the more distant ancestors to make up a 1/8th discrepancy.

BUT if she's Irish: then 13/16 of my 4th-generation ancestors are British/Irish = 81.25% and everything else is 18.75%.   Given that we know that SOME of the Québec/Acadian ancestors married non-French people (not many, but a few), then we START to get closer to the stated results.

The other thing that's cool about the 23andMe results is that their reporting is more in-depth:

Maternal Haplogroup:  J1c1

This mostly stems from central Europe, the Balkans and the Ukraine.   But I suppose it would extend to France too.

Paternal Haplogroup: R-S15280.1

Very, very Irish.

I'm also more Neanderthal than 68% of 23andMe customers!   Yay!

I was able to get Dad to do both Ancestry and 23andMe tests.   Results pending.   I did this because it will also help with the "is Célina Irish" test since it will let me immediately distinguish for all of the identified "distant cousins" that both sites offer which SIDE of the family they're on (if they're distant cousins of both Dad and me, then they're on the Irish side, otherwise the French/Irish side).  

Hopefully enough of these distant cousins might clump in THEIR overlaps to suggest who Célina really was.

Where things are now...

I've (finally) finished the first- and second-generation Acadiens.

Whew!

This was a lot of work, because most of my typical work flow had to be adjusted: the LaFrance doesn't have the Acadian records (since they only map Québec parishes).   Ancestry has some of them (many of them were destroyed by the British at one time or another).  While there are records for Beaubassin, Port Royal, and Grand-Pré, other locations had no records at all.

Fortunately, I found a web site with the Port Royal records neatly organized by family name and date.

But the abundance of gaps, combined with the entire population of French Acadia being dispersed in the 1750s made things hard to follow:  many went to the British colonies, others to France, and others to other French settlements: Louisiana, Québec or places like Miquelon (where I was also able to find some records).   At first I was puzzled by those who went to New England, the Carolinas, etc.:  why would you leave a British take-over of Acadia to go to another British colony?   Then I found out that most were forcibly deported TO those places by the British with the idea that the displaced families would re-integrate --- the attempt was made to send them to somewhat rural places (e.g., western Massachusetts); however, most didn't stay there and moved to the cities (which the British tried to avoid).    Things were particularly awful in the case of Québec City: an outbreak of measles became an epidemic, and hundreds of people died around 1757-1760, with entire families being wiped out.

Another situation happened while trying to determine if spouses of family members I was researching were also distant cousins (which with the Acadiens was extremely common: consanguine marriages of the 3rd and 4th degree were prevalent): in one case, the spouse ended up having a HUGE family tree archived on WikiTree (we're talking several THOUSAND people hitting pretty much every single royal family in Europe and aristocracy galore) - so that took several weeks to map.   (I would've quit, but my OCD kept me going, and I figure that it'll eventually come in handy if I ever get a breakthrough on the Irish parts of the family tree!)

So, now I'm going through all of the WikiTree entries for the direct descendants, filling in the blanks and getting a sense of what will be involved in finishing up the first- and second- generations mapping for the pre-Québec families.   That'll be the last "phase" of this project.

...  Then we start the third generation with (by my estimate) about 8-9,000 families to map.

Given that it took over three years to get this far, it might be 2021 or so before I'm finished with that.

Monday, April 23, 2018

Genomic Collapse is a Bitch

Sometimes distant cousins marry.

Sometimes not-so-distant cousins marry.

And then there's Jaime and Cersei Lannister, but that's another story.

In any case when this happens, the existence of shared ancestors begins to shrink down the number of people in the family tree, compared to what could possibly be there.

I'm defining "Genomic Collapse" as:

One minus the ratio of unique people in a subset of the family tree to the number of "filled slots" in that subset.
So in the extreme case of Joffrey Lannister (whose parents were brother and sister), that's 50% genome collapse at a minimum because the maternal and paternal blood lines are identical; and "at a minimum" because any other consanguine relationships further up continue to contribute to the genome collapse.

So - what is it for my maternal family tree?


  • Out to 10 generations:  41.4% complete tree but with 30.2% collapse (296 people in 424 filled slots out of 2046 total slots);
  • Out to 15 generations: 5.9% complete but with 49.0% collapse (988 people in 1,937 filled slots out of 65,534 total slots);
  • Out to 20 generations: 0.2% complete but with 51.4% collapse (1,074 people in 2,208 filled slots out of 2,097,150 total slots).
I'm not sure if this is the right way to do this because once you hit a non-unique ancestor, that effect doubles with every generation back.
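For what it's worth, the collapse figures above can be reproduced with a few lines of Python. This sketch assumes collapse is computed as one minus (unique people / filled slots), which is what the numbers imply, and that the total-slot counts are for a full tree (2 + 4 + ... + 2^N); the function names are just illustrative.

```python
def total_slots(generations):
    """Ancestor slots in a full tree out to N generations: 2 + 4 + ... + 2**N."""
    return 2 ** (generations + 1) - 2

def collapse(unique_people, filled_slots):
    """Fraction of filled slots occupied by a repeat of someone already counted."""
    return 1 - unique_people / filled_slots

# (generations, unique people, filled slots) from the list above.
rows = [(10, 296, 424), (15, 988, 1937), (20, 1074, 2208)]
for generations, unique_people, filled in rows:
    print(generations, total_slots(generations), f"{collapse(unique_people, filled):.1%}")
```

This is a cumulative measure, so it doesn't address the doubling effect: every repeat ancestor drags their whole ancestry along with them, which inflates the deeper generations.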

In any case, going from generation to generation it looks like this:



The blue line is the percent overlap at that generation.   The green line is the % overlap for that generation and all the ones preceding it.    The orange line is the highest possible percentage of overlap (i.e., all unknown ancestors for that generation are all repeats of known ancestors), the red line is the lowest possible percentage (all unknown ancestors are new people).



Wednesday, April 18, 2018

I think I know how to find Célina Boulé

It occurred to me last night as I was falling asleep that I might be able to use the DNA results to identify Célina Boulé's parents.

It's something of a long-shot, but given this tree:


starting with my grandmother, what we're trying to find is the set of 3rd great-grandparents that's missing.  

All of the DNA matching products try to match you up with distant cousins.   I'm slowly making my way through that list of people, looking at their family trees (when they've bothered to make one) to see if I can find common ancestors, and thereby establish our relationship.   A few have had Alexandre Guimond and Célina Boulé as the common ancestor.

But some of these supposedly "high-confidence" matches appear to have no common ancestors.  This got me thinking - what if the common ancestors are the missing parents for Célina Boulé?  And - how could we identify who they are?

What I need to do is find 4th cousins, determined genetically but NOT through a shared family tree.  Why?  Because if we can identify them through a family tree, then we know they're not Célina's parents.    What we want to find are all the 4th cousins for whom it's not possible to find a common ancestor but that we are certain is on the Québec side of the family (because otherwise they might be one of the unknown Irish or British 3rd great-grandparents).

So:
  1.  Are we genetic 4th cousins?
  2.  Do our family trees overlap in Québec?   (Or not - see below!)
  3.  Do we not see any overlap at the 5th generation?    

Note that this isn't as difficult to confirm as you might think.  There are only three family groups to consider:   Narcisse Guimond and Céleste Sévigny,  Basile Tousignant and Marguerite Maillot,  and Pierre Bélanger and Thérèse Maillot.  If none of these families are there, then the only one that remains is the parents of Célina Boulé.   

Nonetheless, for this to work, it still requires one other thing:  Célina MUST have a sibling and my 4th cousin must be a descendant of that sibling.   Otherwise my 4th cousin and I will have the same "hole" in the family tree with Célina being a dead-end.
So - given a set of 4th cousin candidates, under the conditions above, the intersection of all of our family trees' common ancestors should point to the parents of Célina Boulé.   This can get tricky, because it's possible that her parents are already in my family tree through some other route.   If that location in the tree is far away from the 3rd great-grandparent position (i.e., they're also C:9,4's or something) it might still work because the DNA correlation for such a distant cousin would be negligible.  
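The filtering and intersection described above could be sketched like this in Python. Everything about the data is hypothetical: the match records, the "Couple A/B/C" placeholders, and the `predicted`/`couples` fields are invented for illustration and are not anything Ancestry or 23andMe actually exports.

```python
# The three known 5th-generation couples on the Québec side.
KNOWN_COUPLES = {
    "Narcisse Guimond & Céleste Sévigny",
    "Basile Tousignant & Marguerite Maillot",
    "Pierre Bélanger & Thérèse Maillot",
}

def candidate_parents(matches):
    """Intersect the ancestor couples of genetic 4th cousins
    whose trees contain none of the known couples."""
    pools = [
        set(m["couples"])
        for m in matches
        if m["predicted"] == "4th cousin" and not KNOWN_COUPLES & set(m["couples"])
    ]
    return set.intersection(*pools) if pools else set()

# Entirely made-up matches, for illustration only.
matches = [
    {"name": "match 1", "predicted": "4th cousin", "couples": ["Couple A", "Couple B"]},
    {"name": "match 2", "predicted": "4th cousin", "couples": ["Couple A", "Couple C"]},
    {"name": "match 3", "predicted": "4th cousin",
     "couples": ["Narcisse Guimond & Céleste Sévigny", "Couple A"]},
    {"name": "match 4", "predicted": "2nd cousin", "couples": ["Couple B"]},
]
print(candidate_parents(matches))  # {'Couple A'}
```

A strict intersection is probably too harsh for real data, since a single match from the Irish side would empty the result; counting how often each couple turns up might be a more forgiving variation.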

What if Célina is not Québecois?

One of the likely possibilities is that Célina is adopted and might not have a Québec background!  Instead, it's possible she's an Irish immigrant escaping the Irish potato famine.  There's no evidence to support this but the timing fits, and there was a huge influx of refugees from Ireland in 1847 (at which point Célina would be about 7 years old).  Nearly 100,000 immigrants came through quarantine at La Grosse Île, about 30 miles downstream from Québec, before being relocated to Québec, Canada West, and the US.

The mortality rate from typhus was huge: about 1 in 6 died during, or shortly after, the crossing.  So, it's also possible that she was orphaned in this way.     There were programs set up for adoption, typically done by the church, where (usually older) children were placed with families.  It was more of a "foster" system than adoption; the children were seen more as convenient labor than part of the family, and typically kept their original family names.   So Célina might be one of the exceptions: if  - as I suspect - she were adopted/fostered by Moïse Boulé and Domitille Bernier (who do not appear to have had any children of their own), she took on their name (at least some of the time).

(Also, "orphan" has a slightly different meaning in the context of the situation:  some "orphans" had a living parent, but one who was not able to care for them (sickness or destitution), which makes the "adoption" more like foster care.)

But her Irishness can be tested too: if there's clearly Québec integration from the late 19th century (i.e., Célina's sibling also married into a Québec family and had children) but the only common ancestors are Irish and never Québecois, then that would lend support to the "Célina is an Irish immigrant" theory.

It doesn't appear there's much in the way of records showing which children were placed with which families.  Another avenue I have not attempted is to see if there are other children that might've been adopted/fostered by Moïse Boulé.  

UPDATE: 8/16/2018

No conclusive results - yet!   But an interesting thing came out of the DNA testing.  According to both Ancestry and 23andMe, my DNA is ~85% British/Irish and ~15% everything else.   Based on the locations of ancestors in my family tree, at the 4th generation, I'd expect that to be more like 75/25, UNLESS Célina is Irish, in which case it becomes more like 81/19 which STARTS to look like the actual results.

That doesn't "prove" anything but it's one more datum that supports the hypothesis.


Sunday, April 15, 2018

Estimating variance based on family DNA

So I'm still curious about the DNA results...

I'm thinking that the lack of French ethnicity might be due to a weird roll of the dice in terms of the DNA makeup I inherited from my father and mother.   Part of this comes from comparison to a first cousin: we share the Bradish/Guimond side of things but his mother's side of his family is far removed geographically from my father's side of the family.   Yet he comes up with 11% French, whereas I come up with 0%.   Since the overlap is on my mother's side, I think I can safely preclude any "Jerry Springer"-esque explanation.  :-)

So - how to model this?   It occurred to me that getting DNA samples from my father and sister (or brother) might be enlightening.

For my father, there should be roughly 50% overlap (from a zeroth-order assumption).  So anything that doesn't overlap comes from my mother.   If the overlap is far more than 50% (and I'm not a biologist so I don't know if this is a valid hypothesis) then - borrowing from Game of Thrones - "the blood is strong" and genetically, I'm more Irish than French.   I'd expect that my father should come out almost exclusively Irish with some English.   But for my siblings, I suppose anything is possible in terms of the Irish/English/French ratios (unless my hypothesis is entirely invalid).

In the case of my siblings, they should also have roughly the same 50/50 split (as a base assumption), but with different overlaps.   Quantitatively, the difference would be some kind of estimate of the variance within one generation of "shifting" - in other words, "you have your father's nose, but your mother's eyes" might apply to one child, but the opposite for another. 

So, if there's X% variance between siblings, then that effect gets multiplied as you go back in time, complicated by the additional consanguine relationships with in-laws with the eventual "genome collapse" that happens when those family lines intersect from a shared common ancestor.  (The math for that might be completely impossible to do - but I think it would be a fun challenge!).

I also have to wonder to what extent the rabid genome researchers would love to sample the DNA from these ancestors.  Aside from the nasty thought of "grave robbing DNA", it would be extremely interesting and could solve some mysteries.   The biggest dead-end on my mother's side is her great-grandmother Célina Boulé:  was she adopted?  Who are her parents?  It's extremely likely that she is also a distant cousin from some blood line, but I think I've successfully ruled out the handful of "possible relationships" that others have used on their family trees. 

There's also the intriguing -- though I think unlikely -- possibility that she isn't Québecois at all.   Apparently there are cases of Irish children fleeing the potato famine being sent to North America.  Perhaps she's one of them - which would also inject Irish ethnicity into what otherwise would've been a nearly 100% French genome, for me only 4 generations back (so 1/16th overall but 1/4 of the Guimond line).   If it's also true that she was adopted by Moïse Boulé and Domitille Bernier, it might make sense, since as far as I can tell they had no children of their own. 

I have two hopes:  1) that I can get some breakthroughs on my father's side of the family.  Finding a few bona-fide third or fourth cousins might open up some avenues;  2) ditto on the Guimond side, in the hopes that eventually we solve Célina's past.

Saturday, April 14, 2018

DNA results - part 2 - distant cousins

So one of the OTHER "features" of the DNA testing is that it's supposed to help you identify potential distant cousins based on likely common ancestors.

From the Ancestry.com side of things, this is something of a bust too.   At least for me.

I've got 55,000+ people on the family tree.   But there are several dead-ends - especially on the Irish side and on the Bradish line.   So I was hoping to find others who might be able to provide a breakthrough or two.

Ancestry has identified ~250 people with whom I have a shared ancestor.   However, looking at their profiles, they fall into three categories:


  1. People who took the DNA test but haven't done their family tree - no help there;
  2. People who started on their tree but it only has a few (e.g., 10) people on it - no help there;
  3. People who have fairly large family trees.  YAY...  but hey - their entries use the same format I do for names, etc...
... oh dear - their family tree is based mostly on getting data from another tree...   Mine!

Oh well  :-)

But there is one first-cousin whose tree has a different set of parents for John Patrick Bradish (the great-grandparent who was born in Punjab) and they're Irish, not English.   This changes the Ethnicity assessment (skewing from mostly English to mostly Irish).   The parents I have came from a Google search, are VERY low-confidence, and relied on the assumption they'd be English (since he served in the British army).   Although Ireland was part of the United Kingdom at the time, it just seemed to me to be unlikely many people would enlist in the British Army from Ireland.   But that might not be a valid assumption at all, and even if it were, it doesn't mean that this isn't an exception to the rule!

Something to follow up on!

DNA Results are in... What the hell?


So I get my Ancestry DNA results.


  • 59% Ireland/Scotland/Wales 
  • 32% Great Britain
  • 8% Other


I expected:


  • 50% English (based on the Bradish/Murphy and the Hall/Murphy lines, but see the next posting)
  • 25% Irish (Donahue line)
  • 25% French (Guimond line)
Where's the French?

Now all the Ancestry DNA ads have some kind of "Find Exciting Surprises" where someone takes the test thinking they're an Italian/German mix and discover they're 80% Russian and 20% Japanese or something like that.

But the largest chunk of my family tree that's mapped are Québecois - and I have HUNDREDS (probably a few thousand at this point) of original settlers who were born in FRANCE.   Now I know SOME of them started out somewhere else, went to France and from there went to Québec.   I know that some of the immigrants came from England, Ireland, Germany, Switzerland, Spain, and Italy.  But not THAT many.

I don't know what to make of this.   We've joked that there's some kind of "Jerry Springer" episode in here somewhere "you are NOT the father" - except that it would have to be "you are NOT the mother" and that would be difficult to pull off since surrogate motherhood is definitely more of a 21st century occurrence.

But there are other possibilities.   The Ethnicity estimate comes from where your current-day relatives who have taken the DNA test reside NOW.   So one might expect some degree of variance from that.  EXCEPT that one could do the appropriate weighting based upon coverage of test subjects versus actual population (and they have to do this I think, otherwise NO ONE's results would be accurate).

There's also the fact that one doesn't necessarily inherit the same degree of genetic material from EACH ancestor proportionally.   Perhaps "the blood is strong" (to quote Jon Arryn) is in play here, and it's just that my paternal contributions to my DNA overtake the maternal.   

So I think I'll splurge and get a comparison test with 23andMe.   I'm also wondering if I can convince dad or a sibling to also do the test so that we can compare.


Hey! Wait a minute...

So I'm knee-deep in Acadians.   The records are tricky because once the English invaded and started forcibly removing them, they moved around.   A lot.

Many escaped to Québec, others ended up back in France.   Some settled elsewhere in the Saint-Lawrence River islands:  Miquelon, Prince Edward Island, the Îles-de-la-Madeleine, etc., and some went to other French colonies: Louisiana, and the Caribbean (Haiti, Guadeloupe, etc.).

But many ALSO went to the (now) USA:  Boston, the Carolinas, Georgia, and so on.

This doesn't make any sense to me.   If you've just been invaded by a foreign empire, why would you travel to that empire's colonies? 

But there's a simple explanation --- it turns out that this was actually the British government's doing; at first they "relocated" Acadians to the British Colonies, to rural parts of Massachusetts, New York, and so on.  However this cunning plan didn't work out the way they wanted:  the Acadians refused to stay and just moved to the cities, forming Francophone communes, or tried to get back to Canada (which is exactly what the British did NOT want to happen).   So, the second wave of deportations was made to France instead.

From THERE, many once again moved - this time from France either back to the St. Lawrence River settlements, or south to the Caribbean and Louisiana (which I just learned came under Spanish control in 1762 - I really need to bone up on this history).  Some even went as far as the Falkland Islands! 

I've also noticed that many died shortly after leaving Acadie.   The records in Québec are rife with burials of Acadians in 1759.   But I'm also finding similar spikes in deaths in France among the repatriated Acadians.  (Some others apparently died at sea trying to get to France.)


Sunday, March 18, 2018

Phase 1 1/2: The Acadians


So, I've got about 25-30 Acadian families (from generations 8 to 15) to do.

Not sure how successful this will be because there isn't quite the comprehensive indexing that there is for the Québec families: some of them are in the PRDH, but typically not in Lafrance, although many are also in the GQAF.

The canonical material is from S.A. White published in 1999 but out of print.  A revised 10-volume set is in preparation but with no projected date of release.


Saturday, March 17, 2018

So - was my estimate right?

Way back when (May 2015), I estimated the number of great-aunts/uncles and 1st cousins N times removed there'd be in the tree based upon a calculated average of 5.8 children per family.

What I came up with was 1,673 aunts/uncles (when I started the project).   It ended up being 2,184.

Why the difference?   Some thoughts:


  1. The estimate was based on my mother's side of the family only.  Now while my father's branch of the family tree is meager in comparison, it does add a few dozen people.
  2. The estimate was based only on generations 3 through 10.   I actually ended up working out to generation 13 (although far from complete). 
So - now that I've done the 1st cousins N times removed (8,245 of them) - that ends up being short of the expected 9,760.   Why is that?

Two things, I think.   First, in the estimate going from direct ancestors to aunts/uncles, we KNOW that each of them was married (they're grandparents, after all) whereas there's no guarantee that every child who is an aunt/uncle will also have children (some die early, others never marry, etc.).   Another effect is that single-marriage families might have children over a period of 20 years or more, while multiple-marriage families have a shorter window, which means fewer children per family.

Without looking at every aunt/uncle and removing all the cases where there are no children (code which I could write I suppose), getting an accurate measurement from the aunts/uncles to 1st cousins to estimate the family size for 2nd cousins would be difficult.

In any case, the ratio of 1st cousins to aunts/uncles is 3.8 instead of 5.8, which is essentially an "effective" family size (i.e., counting non-married people in as a family size of zero).

So - I suppose to zeroth-order, the estimated number of 2nd cousins (the C:X,3's on the tree) should be about 31,000 people.  Given that I seem to be able to add about 10K people/year,  I guess we should expect the next report on this to happen in Spring 2021.   :-)
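The back-of-the-envelope numbers above work out like this in Python. It's a zeroth-order sketch that assumes the same effective family size carries over to the next generation of cousins; the variable names are mine.

```python
# Counts from the tree, as reported above.
aunts_uncles = 2184
first_cousins_removed = 8245  # 1st cousins N times removed

# "Effective" family size: children per aunt/uncle, counting the childless as zero.
effective_family_size = first_cousins_removed / aunts_uncles

# Assumption: the same ratio applies going from 1st cousins to 2nd cousins.
second_cousins_estimate = first_cousins_removed * effective_family_size

print(f"{effective_family_size:.1f}")      # 3.8
print(round(second_cousins_estimate, -3))  # 31000.0
```

At roughly 10K people a year, that's about three years of work, which is where the Spring 2021 guess comes from.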


Friday, March 16, 2018

Ancestry DNA kit was sent off 2/26.   Should come back in about 3-4 weeks.

My guess - based on what I have on the tree:


  1. About 50% English
  2. About 25% French
  3. About 25% Irish
Depending on how far back it gets, I expect to see some Viking, some Native Canadian, and some other central Europe (Germany, Spain, Italy).

What would be VERY interesting would be if this in any way helps me crack some of the dead-ends.

I guess we'll see.

Well, I've reached a milestone...

I've just finished cataloging every Québec ancestor, their children, and their grandchildren from my 4th great-grandparents (generation 6) all the way to the original settlers (generation 12 or 13, depending).

That's taken about 2 years. 

The tree now has 55,182 people on it.

It's quite complete.   The only major missing piece of the puzzle is the long-standing question of who Céline Boulé/Laliberté's (generation 4) parents are. 

Now what?

I can start on the Acadiens.

Or I can start cataloging the great-grandchildren of ancestors (I estimate there are about 50,000 of them).

But I think first I'll do some analysis and statistics!