The Case of the Cryptic Chromosome

Where are the Hardy Boys when you need them?

I’ve been doing a lot of work recently to associate my DNA matches with specific segments of the various chromosomes. I think this will prove to be a valuable project in identifying some pre-James Sr. ancestors (and maybe some unknown later relatives). In addition to the various sites which provide DNA testing and matches to DNA relatives (Ancestry, FamilyTreeDNA, MyHeritage, 23andMe, and GEDMatch are the ones I use), there is an excellent free program called GenomeMatePro which imports and organizes all DNA matches and associated chromosome data. I use it as a master database to view all the matches together in one place, as well as to tie together the chromosome segment data from the different sites. Contact me directly if you are interested in using it yourself – it’s a very large and complex program, but once you get the hang of it and what it can do, it becomes the Swiss Army Knife of DNA genealogical research (and I’m talking about one of the big honkin’ knives, not the little pocket babies).

I’ve also been using an online tool called DNA Painter to place various segments on colorized sections of DNA for each chromosome. It is a great visualization tool that helps show where descendants of various ancestors match up. Below is a picture of what I am currently working with on the mysterious (as you will see) Chromosome 3 (I have removed last names to preserve privacy, although almost all of these names are public on the DNA matching sites).

What makes this particular chromosome interesting (and cryptic!) is that the area in the right half contains known descendants of James Sr. or one of his progeny, or people with a different Wilson name, or folks that are descendants of known associated families and regions. For instance, Debbie R. (in pink) is descended from Thomas G. Wilson’s and Sally Duck’s daughter Lovisa Jane and Nathan M. from James Wilson and Delilah’s daughter Elizabeth. There are many other known Wilson relatives I haven’t yet included as they don’t really provide any new information.

OK so now, as promised, the mysteries:

  1. Carolyn D. has quite a bit of DNA overlap with me – 3 segments that total about 50cM (centiMorgans) on this chromosome alone, which is enough to make her a 3rd or 4th cousin on average. To compare, a 3rd cousin would have a common g-g-grandfather of James C. Wilson of Kansas, and a 4th would be in common with his father Thomas G. Wilson. This timeline would put our common relative somewhere between the very late 1700’s and the mid 1800’s, based on probabilities. However, her family tree has *no* Wilsons in it! But it does have connections in the Henry County area (including neighboring North Carolina) up through her recent ancestors. One name that stands out is “Gilley” – two of Thomas Wilson’s (son of James Sr.) daughters (Polly first, then Libby after Polly died in 1813) were married to George Gilley. George was born in 1765 in Buckingham County, Virginia, where I have no record of any of our Wilson relatives, so it is unlikely he was actually a Wilson. If there were any Wilson (or Gilleys descended from Wilson) sources of Non-Parental Events (known as NPE’s, also more accurately called Not Parent Expected), they would most probably have occurred during the 1800’s in the Henry County area. Many Wilsons (and the families of female Wilsons) stayed behind as others in the clan headed off to Kentucky, Tennessee, Illinois, etc. Of course, the other possibility is that one of my female ancestors, such as Delilah, the wife of James II (grandson of James Sr.), was of the Gilley or some other family in Henry County. But for reasons mentioned later I believe this segment is truly Wilson DNA from long before.
  2. Another fairly close match on Chr3 is Beverly D. She also has ancestors with familiar names in Henry County, VA, such as Stephens and Bailey (both families intermarried with Wilsons at various times). Some of the Stephens also moved to the Pulaski County, KY area along with or at about the same time as the Wilsons. For the same reasons as with Carolyn D. above, I believe there was an NPE in Kentucky in the 1800’s involving Wilson DNA. In both cases, we would really need Y-DNA evidence from male descendants of these families to know much more.
  3. Now it gets more interesting (or at least not as sordid :-)): two or three matches, Rhonda H., RK F., and Karen J. (who might only be 2 people), actually do have solid Wilson lines in their trees, all traced back to a James Edward Wilson, b. 1830 in Tennessee, who was the son of Edward b. 1795 in Virginia. There is no indication who Edward’s father was or in what part of Virginia Edward was born. The tax lists for Henry County and Franklin County do have an Edward Wilson in 1782-1785, 1795, and as late as 1803 (not all areas have complete lists for all years). It obviously isn’t a given that Edward born in 1795 would be the son of another Edward, but it’s a good place to start looking. Unfortunately, the Edward in Henry/Franklin isn’t geographically adjacent to our known Wilsons, although he is close to other Wilsons. If this Edward is truly the progenitor of this group on Chr3, then it is likely the other Wilsons in that area are related to us also. But they ended up in different parts of the old Henry County. Did they migrate together and split up, or were they just distant relatives already? Once again, more Y-DNA testing might reveal answers in the years ahead.
  4. Finally, we have the most interesting clue in our cryptic chromosome, which is thanks to Ralette B. (who also gets today’s prize for the most interesting relative name). Ralette’s family tree has a clear and documented Wilson line going back to Lanarkshire, Scotland, with a John Wilson immigrating to Maryland in the late 1700’s. The level of DNA matching suggests a 4th cousin or so, but our common ancestor would have to be at least 1 generation prior to James Sr. given the relatively late date of Ralette’s ancestor’s arrival in the US. I will be writing up a separate blog post on this line and where there may or may not be a match. But it throws a monkey wrench into my theory that James or his family emigrated from Ulster (see my previous post on Y-DNA results).

So to sum up our case: the DNA segment on Chromosome 3 that is definitively correlated to known descendants of James II (grandson of James Sr.) and Thomas G. Wilson (James II’s son) also matches with descendants of related families in Henry County, Virginia and Pulaski County, Kentucky, suggesting an NPE (“Not Parent Expected”) from these areas, which will probably not be resolved until there is more DNA testing of those groups. We also have a new link to a relatively (!) unknown Edward Wilson who could be a product of the Henry County clan, or perhaps a related group in Virginia. And finally, for the first time we have a solid, documented connection to a Wilson directly from Scotland!

But unfortunately, we aren’t yet able to close the Case of the Cryptic Chromosome. Maybe Dad will take us to the Bayport Dairy Queen for milkshakes anyway…*

* Lame Hardy Boys reference

Y-DNA Results

While my Y-DNA test results have not yet proven particularly useful in identifying James Sr.’s ancestry, they have shown some interesting connections that may still end up being informative.

First a little DNA background is in order. There is no way I can do justice to the entire topic of DNA and DNA testing, but I will try to give just enough to help the reader understand my interpretation of my results. There are many sources of more detailed information on the internet and in books – one good place is https://learn.familytreedna.com/dna-basics/ydna/ .

The Y chromosome is one of the 2 sex chromosomes, the other being the X. Each person inherits an X chromosome from their mother (one of her two), and either their father’s Y (for males) or his X (for females). A man thus has both X and Y chromosomes, and a woman two X’s (but no Y). So, the Y chromosome represents the unbroken male lineage of every man, back to the first “Adam” from which we are all descended. However, over time the Y chromosome mutates slowly, and at a different rate for each section of the chromosome. As a result, men from different lines end up with distinguishably different DNA sequences on their Y, from which we can derive information about our ancestry and relatives.

There are two types of Y mutations that are typically tested for – Short Tandem Repeats (STR’s) and Single Nucleotide Polymorphisms (SNP’s, referred to as “snips”). STR’s are basically markers where the genetic code repeats a certain number of times in a fairly well-known fashion; the mutation is in the number of times the code repeats (e.g. from 22 to 23). FTDNA will test anywhere from 12-700 of these known markers, and report a Genetic Distance (GD) to other testers. The GD represents the number of differences in STR counts across all tested STR’s. So, a GD of 3 could be a difference of 3 repeats in one marker, or a difference of 2 in one marker and 1 in another. To find a match within even about 8-10 generations requires at least 67 markers tested (Y-67). For example, a known 3rd cousin of mine (with a common great-great-grandfather, James C. Wilson of Kansas) has a GD of 0 at 67 markers, whereas cousins with whom I believe I share a common ancestor of James Sr. or one of his sons have a GD of 3 on a Y-67 test, and a GD of 4 on a Y-111 test. At the other end of the scale I have over 3500 matches with a GD of 0 for Y-12, most of whom do not share the Wilson surname. We are probably related back many hundreds or thousands of years before surnames were in use (or perhaps because of a few Wilson roosters in other surname hen houses along the way)! I’ll discuss some of the interesting matches later, but for genealogical purposes most of these low STR matches are useless.
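To make the GD arithmetic concrete, here is a minimal Python sketch of the simple counting described above: add up the repeat-count differences, marker by marker. The marker names are real STR markers, but the repeat counts and the two "cousins" are made up for illustration, and a testing company's actual calculation may treat some markers differently, so take this as a sketch of the idea rather than FTDNA's method.

```python
def genetic_distance(tester_a, tester_b):
    """Sum of absolute repeat-count differences over markers both men tested.

    This is the plain step-by-step count described in the text; it is an
    assumption that every marker is scored this way.
    """
    shared_markers = tester_a.keys() & tester_b.keys()
    return sum(abs(tester_a[m] - tester_b[m]) for m in shared_markers)


# Hypothetical repeat counts, for illustration only.
me = {"DYS393": 13, "DYS390": 24, "DYS19": 14, "DYS391": 11}
cousin_a = {"DYS393": 13, "DYS390": 27, "DYS19": 14, "DYS391": 11}  # 3 repeats off in one marker
cousin_b = {"DYS393": 13, "DYS390": 26, "DYS19": 15, "DYS391": 11}  # 2 off in one marker, 1 in another

print(genetic_distance(me, cousin_a))  # 3
print(genetic_distance(me, cousin_b))  # 3
```

Both hypothetical cousins come out at a GD of 3 even though their differences are spread across the markers differently, which is why a GD alone says little about which markers changed.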

The other test done by FTDNA (and others) is the SNP test, called most recently the Big-Y. This test examines thousands of points along the Y chromosome, and reports where the tested chromosome deviates from a “reference” in being positive for a particular mutation of the 4 building blocks of DNA – A, G, T, C. The combination of all these mutations across the thousands of tested spots yields a specific “haplogroup” that each male belongs to. Depending on how many other men have been tested, a haplogroup could be unique to an individual, or shared with thousands or millions of other men. These mutations could conceivably (LOL) happen every generation, giving a man a different haplogroup than his father, but on average they occur about every 80-150 years (depending on the SNP), so a man is typically in the same haplogroup as men who share an ancestor with him about 5-10 generations ago. However, any particular “terminal” haplogroup may not have been discovered yet, depending on how many of his “close” relatives have been tested. It takes 2 men testing positive for a particular mutation for it to get a “name” and be placed in the tree. If no other men on a branch have tested positive for these mutations, they are called “private variants” and just given a number. In my case, I have 21 private variants, which means that no other man on my branch of the tree within somewhere between 21*80 and 21*150 years (or about 1700-3200 years) has been tested. So, if nothing else I know that no other descendant of James Sr. has been tested! Actually, there is 1 other I know of, but he was tested by a different company and our results haven’t been combined. If and when they are, I expect each of us to have only 2-3 variants since our likely common ancestor is James Sr., who was born about 250 years before us.
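The private-variant arithmetic above is easy to check with a few lines of Python. This is only a back-of-the-envelope sketch: the 80-150 years per SNP range is the rough average quoted above, and the function name is made up for this example.

```python
def private_variant_window(n_variants, years_low=80, years_high=150):
    """Rough (earliest, latest) span in years implied by a count of private variants.

    Assumes one new SNP roughly every 80-150 years, the average range used
    in this post; real mutation rates vary.
    """
    return n_variants * years_low, n_variants * years_high


print(private_variant_window(21))  # (1680, 3150), i.e. about 1700-3200 years
print(private_variant_window(2))   # (160, 300)
print(private_variant_window(3))   # (240, 450)
```

The 2 and 3 variant windows both bracket the roughly 250 years back to James Sr.’s birth, which is why 2-3 private variants each is the expectation once another descendant’s results are combined with mine.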

I am going to work up some charts to explain this better, and to give some rationale for further testing by descendants of James Sr. in a subsequent post. But for now, let me share some interesting findings.

As I mentioned, I have a known 3rd cousin who has tested at Y-67 with a GD of 0 (as a reminder, these are tests of STR’s and the above discussion of SNP mutation rates does not apply here – STR mutation rates vary greatly and are much more complex to calculate). On Y-67 tests, there are also a couple of descendants of John “Culpeper” Wilson at a GD of 3. As discussed in other posts, this John Wilson is likely a grandson of James Sr. so our common ancestor would be about 8 generations ago. One of these gentlemen has also tested at Y-111 with a GD of 4 to me. Both of these results are consistent with a probability of greater than 60% (using FTDNA’s calculator) that our common ancestor is James Sr., although it could be earlier than that (which I find unlikely based on other research, but it is something to keep in mind).

One other tester with a GD of 5 at Y-67 is also believed to be a direct descendant of James Sr. In this case, the probability of that drops to just over 50% but is still quite likely. However, once again maybe there is another common ancestor in Henry County at that time we just haven’t identified yet.

A much stranger result is the presence of two men, once again at GD of 3-4 on Y-67, with last names of Brabazon and O’Brollaghain, who both live in Ireland! Some correspondence with another, unrelated, Brabazon in Ireland has yielded the theory that a Wilson male in the Ulster region in the 1600’s or 1700’s took the name Brabazon (of which O’Brollaghain is a derivative – go figure!), possibly due to marrying into a wealthy family of that name, which was not uncommon at the time. Of course, there is also the possibility of a direct adoption or a child born of an infidelity.

The fact that the Brabazons are associated with the Ulster region, and still are intermingled there with Wilsons today, suggests our James or his ancestors may have come from Ulster. There definitely was a branch of Scottish Wilsons that migrated to Ulster. The challenge is in figuring out how far back our common ancestor is — it clearly is at least 1 generation before James Sr., and based on FTDNA’s calculator could easily be 4 more generations with high probability. The branch could even have happened in Scotland before the migration to Ireland, which is looking at least as likely with some more recent autosomal DNA matches I have found. Notwithstanding the new mystery provided by this match, it adds an intriguing bit of history to our line.

A footnote: speaking of roosters being in hen houses they oughtn’t be, a couple of years ago I talked with a gentleman in Tennessee who was a GD of 0 to me at Y-37. Both his father and grandfather were born in Pulaski County, Kentucky, which was the area many of our Wilsons migrated to from Henry County, Virginia, and where there are still many, many Wilson descendants. This man’s surname is completely different, which is somewhat unusual for such a close connection. Both his father and grandfather had multiple marriages and children late in life, such that the grandfather’s birthdate was around 1830. Since the g-grandfather of this man had migrated to Pulaski County around 1800 (from a completely different area than our Wilsons), either the father or the grandfather of my match must have been a Wilson! We can’t really tell which one (maybe someday testing more descendants of each will narrow it down), but both families (Wilson and the other surname) also intermarried several times as did many other families in the area, so it isn’t surprising there was lots of familiarity and interactions between the two (obviously some was very close!). It just goes to show what interesting finds may come from DNA testing – people need to be prepared for discovering disturbing skeletons in the family tree.