Welcome Cousin! Are We Related? How So?


Welcome! If I sent you a link to read this page, you have contacted me to explore our common genealogy. Either you have an autosomal DNA (atDNA) match to one of the DNA test kits that I manage, or you might have found a common ancestor name in one of the family trees that I have published online.

So here are the possible scenarios:

  1. We are blood relatives, sharing a common ancestor within “genealogical time” (about 10 generations into the past)
  2. We are very distantly related
  3. We have no common ancestry at all

I created this page as a kind of “canned response” to new inquiries. I don’t want to sound rude, dismissive or condescending. What I have found, over the years, is that my first response to an inquiry requires that I explain a great deal about how genetic genealogy works. Just because Ancestry (or MyHeritage / FamilyTree / 23andMe) says that we are 2nd or 3rd cousins does not mean that it is a fact. That is a statistical approximation based on minimal analysis. We will not successfully discover who our common ancestors are unless we both have good family trees going back three or four generations to compare. I am posting all this on a web page so that we can begin a productive correspondence.

If you are relatively new to genealogy, especially DNA genealogy, please read through this carefully. It is not a substitute for some of the better guides that are on the Internet … just the beginning of a trail to follow.

If you are an experienced genealogist who is familiar with DNA genealogy, you know exactly why this page exists. The information that you need (published family trees and DNA kit information) is on this page.

Genealogy Exploding Head Syndrome

I have a problem. I have been contacted by hundreds of people who were told by one of the DNA testing services that we have matching DNA and that we are somehow related. My backlog of emails that I have not answered at all is close to 100 people deep and some are almost three years old. This is because when I do respond, I spend a great deal of time explaining how genetic genealogy works, how it doesn’t work, and what needs to be done to further family research using it.

Many of the messages I receive are very inviting and polite, like: “Ancestry says that we are 2nd to 3rd cousins … can we explore this? Here is a list of my family names and locations.” Sometimes it is a query about one of my published family trees, like “Do you have any of these individual names in your family tree?” or “My great-grandfather comes from Czechoslovakia … do you have any information about this family name from there?”

Genealogy is complicated, tedious work. There are no “easy answers” or “quick fixes” to the fact that many of us did not inherit an extensive family tree from our parents or grandparents, if they are still alive. It is very rare that you will stumble upon a close relative whom you have never heard of before. It does happen. I have found at least one branch of my family that we did not think survived the Holocaust … neither did they. DNA confirmed this, but it took more than DNA matching to figure out how we were related.

There is a corresponding problem. When I try to contact my matches, after I have done some preliminary analysis of DNA segments and family trees, I never hear back. I send out polite inquiries to my matches, with plenty of data to help them work out the details, and I never get a response.

Both problems have a common cause. There is too much data, not enough information, and very little guidance from the DNA testing companies about how to use what they give you. They also do not provide much useful information about how unreliable their statistical estimates of relatedness are. This is, in part, because they are in business. Their marketing sets very broad expectations: that with the tiny bits of information available from DNA testing, two people can connect their family trees, discover their shared lineage, develop large parts of their missing ancestry, and invite another 30 people to the next family reunion.

When I make my first response to new “relatives,” I try to explain that the process of making sense of DNA matches is complex and time-consuming. Many people don’t have the time or interest to learn the intricacies of genetic genealogy. I am not a professional genealogist and I don’t play one on TV. I think that I first tested in 2017, and I have been working on learning how to make use of the results, and to connect with family members, since then. What is difficult for most people who are new to this is that it is just not easy to draw meaningful conclusions more than 2 generations back without testing large numbers of family members and having a well-developed family tree going back more than 4 generations. Some genetic genealogists say you really need a family tree going back 10 generations.

The most heart-breaking inquiries are from adoptees who are searching to learn their true family history. They are hoping to make contact with someone who will give them the puzzle piece that will connect them to their past, and I really wish that I could give it to them easily. There are one or two places in my immediate family history where a child could conceivably have been conceived out of wedlock within the past 3 generations. The data coming from DNA matches probably goes back further, and into areas in my family tree that have not been developed. I wish that I had an easy answer for you, or at least another breadcrumb in the trail that has to be followed. Unfortunately, it takes a lot of work, and most of the so-called leads that you will get from atDNA are going to be dead ends.

Exponential Relative Numbers

My genealogy database, as of this writing (October 2019), contains 2,157 individuals. Some of these are not blood relatives because I try to document some close spousal families and their descendant trees as well.

My DNA analysis database contains 102,300 matching “relatives.” That is the software designer’s choice of terms. Every individual who matches one or more of the DNA kits that I manage is treated as a “relative.” There is some duplication, on the order of three to four records per actual human. Let’s assume that it is four, on average. That is still over 25,000 people who have tested their DNA and are related to me in some way. Again, those numbers are from October 2019. I am going to have to create a second database and limit it to confirmed relatives who have tested, because the “brute force download” database is unwieldy, both in search time and in the amount of garbage data.

Here is some mathematics, to give you some perspective on the size of the haystack in which we are searching desperately for lost needles:

Assume, for a moment, that every ancestor in your family tree was an only child and each couple had exactly one child. This is the number of blood relatives that you would have in four generations.

Generation  Relationship              People in your tree
0           Yourself                  1
1           Your parents              2
2           Your grandparents         4
3           Great-grandparents        8
4           Great-great-grandparents  16

Add that up and you have 31 people in your family tree in four generations. Ideally, genealogists like to have all the following information for each person: full name, plus the date and place of birth, death and marriage. Do you know all this information about those 31 people? If so, you are off to a great start. In this scenario you probably don’t have many DNA matches, because nobody has any siblings: you are your only possible cousin.
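The arithmetic here is just repeated doubling. As a minimal sketch in Python (the function name is made up for this illustration), this counts yourself plus your direct ancestors in each generation and adds them up:

```python
def ancestors_per_generation(generations):
    """Return [1, 2, 4, ...]: yourself, then your direct ancestors per generation."""
    return [2 ** g for g in range(generations + 1)]


counts = ancestors_per_generation(4)
print(counts)       # [1, 2, 4, 8, 16]
print(sum(counts))  # 31
```

Run the same function for 10 generations (“genealogical time”) and the tenth generation alone holds 1,024 ancestors, for 2,047 people in total.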

Let’s make things slightly more realistic. Let us assume that every couple in your family tree has/had exactly two children. Let us revisit the table and count our cousins.

Remember that your first cousins share a common grandparent, your second cousins share a common great-grandparent and your third cousins share a common great-great-grandparent with you. The “once removed” stuff happens when you and the person in question are of different generations, relative to your shared ancestor.

Gen.  Relationship name         In your tree  Uncles / aunts  First cousins  Second cousins  Third cousins  Total down
0     You / siblings            2             0               4              16              64             86
1     Parents                   2             2               8              32              -              44
2     Grandparents              4             4               16             -               -              24
3     Great-grandparents        8             8               -              -               -              16
4     Great-great-grandparents  16            -               -              -               -              16
      Total across              32            14              28             48              64             186

So for finding first cousins, with only two children per couple, a family tree of fourteen people is adequate to find all relatives: you and your sibling (2), your first cousins (4), your parents (2), your aunts and uncles (2) and your grandparents (4). To find second cousins, you need a family tree of fifty people: the fourteen above, plus your second cousins (16), your parents’ first cousins (8), your grandparents’ siblings (4) and your great-grandparents (8). Do you see where this is going? It’s not going to be pretty. To find third cousins you need a family tree of 186 people, which is every number in the table. I can make a larger table, but it will not all fit here.

Now, modern families are relatively small, 2 or 3 children per couple. Go back three or four generations, and eight to ten children per couple was not uncommon. Not all of the children survived to child-bearing age, and not all who did had children, but these numbers get very big very fast.
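The cousin table can be recomputed for any family size. Here is a rough sketch in Python. It assumes that every couple has the same number of children, that nobody marries a relative, and that the tree stops at a fixed number of generations. The function names are hypothetical and exist only for this illustration.

```python
def relatives_table(children_per_couple=2, generations=4):
    """Rebuild the cousin table as a list of rows, one row per generation.

    Each row is [in your tree, siblings (uncles / aunts), first cousins,
    second cousins, ...] for the people of that generation.
    Assumes identical family sizes and no marriages between relatives.
    """
    c = children_per_couple
    rows = []
    for g in range(generations + 1):
        # Row 0 is you plus your siblings; row g holds your 2**g direct ancestors.
        row = [c if g == 0 else 2 ** g]
        for k in range(generations):  # k = 0: siblings, k = 1: first cousins, ...
            shared_ancestor_generation = g + k + 1
            if (g == 0 and k == 0) or shared_ancestor_generation > generations:
                # Your own siblings are already counted in row 0, and relatives
                # whose shared ancestor lies beyond the tree are not counted.
                row.append(0)
            else:
                # One person has 2**k * (c - 1) * c**k k-th cousins.
                row.append(2 ** g * 2 ** k * (c - 1) * c ** k)
        rows.append(row)
    return rows


def tree_size(children_per_couple=2, generations=4):
    """Total number of people in the table."""
    return sum(sum(row) for row in relatives_table(children_per_couple, generations))


print(tree_size(2, 2))  # 14, enough to place first cousins
print(tree_size(2, 3))  # 50, second cousins
print(tree_size(2, 4))  # 186, third cousins
print(tree_size(8, 4))  # 34968, same depth with eight children per couple
```

With eight children per couple, the same four generations grow to roughly 35,000 people. Real families are not this uniform, so treat the result as an order of magnitude, not a head count.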

There are tables that estimate how closely related you are to someone, based on the amount of shared DNA between you and your match. See the Shared CentiMorgan Project web site. There is broad variability, and the tables presume that there is no endogamous population involved.

If your DNA matches mine, you have some Ashkenazi Jews in your ancestry. The Ashkenazi Jewish population was highly endogamous until recent generations.  Endogamy skews the estimates significantly. Read on …

How Reliable Are the DNA Estimates?

It depends. If you have Ashkenazi Jewish (AJ) ancestry, those estimates become unreliable beyond “first cousins.” This is because of endogamy. It was very unusual to marry outside the faith. People did not travel as far as they do now. The movement of Jews was also severely limited by where they were allowed to live and travel. As a result, the community of available spouses was very small. It was inevitable that blood relatives would inter-marry. All efforts were made to avoid marrying closer than second cousins to reduce deleterious genetic effects. However, any intermarriage in an earlier generation completely mangles the statistical models that are used to estimate how closely two persons are related based on matching DNA segments. If your testing service says “2nd to 3rd cousin” you need to be thinking “4th to 5th cousin and maybe also a 6th cousin thrown in there.”

There will come a time, hopefully, when the DNA services will use markers to indicate endogamous populations, and use different models for generating those labels like “likely 2nd or 3rd cousins.”

My DNA is 98.6% AJ, based on the models used by 23andMe, and nearly 100% according to AncestryDNA’s model. Ancestry percentages in the 1% to 2% range should mostly be considered “noise” and ignored. If you have a DNA match to one of the kits I manage, you are matching AJ DNA. That means that we are already farther apart than you have been led to believe.

Here is a very good article about the problem: “No, You Don’t Really Have 7,900 4th Cousins:  Some DNA Basics for Those With Jewish Heritage” by Jennifer Mendelsohn. She explains it much better than I can.

Getting Started

If you are new to genetic genealogy, you need to learn a lot to make sense of it. You also must have a family tree going back multiple generations to make use of it for finding distant relatives. If your family tree is sparse in the first three generations away from yourself, that is a much better place to be spending your time researching.

Sharing Your Family Tree with Me

There is no harm in making initial contact with your matches. Be aware that the first thing that anybody with experience will ask you for, if you hear back from them, is your family tree data. The information that is needed is the following:

  • Name, birth date, birth place, death date and death place of each person.
  • Spouse names, marriage dates and places for each marriage, or partnership and approximate date ranges if there may have been a birth resulting (e.g. for adoptees seeking birth family).
  • Linking of parents to children.

That is the data that is included in a family tree. It can be displayed in many different formats. What is important is that the data is known … at least, parts of it.

That is the first thing that I will ask you for to help research our matches.
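If it helps to picture that data, here is a minimal sketch in Python of one way to hold it. The class and field names are my own invention for this illustration and do not come from any particular genealogy program. Dates are kept as plain text because genealogical dates are often approximate (“about 1880”).

```python
from dataclasses import dataclass, field, fields
from typing import List, Optional


@dataclass
class Person:
    name: str
    birth_date: Optional[str] = None
    birth_place: Optional[str] = None
    death_date: Optional[str] = None
    death_place: Optional[str] = None
    # Linking of parents to children: each person points at known parents.
    parents: List["Person"] = field(default_factory=list)


@dataclass
class Marriage:
    spouses: List[Person]
    date: Optional[str] = None   # exact date, or an approximate range
    place: Optional[str] = None


def unknown_facts(person):
    """List the basic facts that are still missing for one person."""
    return [f.name for f in fields(person)
            if f.name != "parents" and getattr(person, f.name) is None]


ancestor = Person(name="Example Ancestor", birth_date="about 1880")
print(unknown_facts(ancestor))  # ['birth_place', 'death_date', 'death_place']
```

A partly filled record like this one is still useful. The gaps simply show where the research needs to go next.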

If you have an online genealogy that is freely accessible, or if you have it in a machine-readable format, like a GEDCOM file, that will help. My genealogy software can read GEDCOM files and I can do some analysis on my local computer. There are also online tools that can compare GEDCOM files to find possible common matches.
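To give a feel for what such a comparison involves, here is a simplified sketch in Python. It is not how any particular tool works. It relies only on the GEDCOM convention that a person’s name line looks like `1 NAME Given /Surname/`, and the sample records and function names are made up for this illustration.

```python
import re

SURNAME = re.compile(r"/([^/]+)/")  # GEDCOM wraps the surname in slashes


def surnames(gedcom_text):
    """Collect the surnames found on level-1 NAME lines of GEDCOM text."""
    found = set()
    for line in gedcom_text.splitlines():
        parts = line.strip().split(" ", 2)
        if len(parts) == 3 and parts[0] == "1" and parts[1] == "NAME":
            match = SURNAME.search(parts[2])
            if match:
                found.add(match.group(1).strip().upper())
    return found


def shared_surnames(gedcom_a, gedcom_b):
    """Surnames that appear in both trees, sorted alphabetically."""
    return sorted(surnames(gedcom_a) & surnames(gedcom_b))


# Two tiny, invented GEDCOM fragments.
MINE = """0 @I1@ INDI
1 NAME Jane /Example/
0 @I2@ INDI
1 NAME John /Sample/
"""
YOURS = """0 @I1@ INDI
1 NAME Anna /Example/
0 @I2@ INDI
1 NAME Max /Other/
"""

print(shared_surnames(MINE, YOURS))  # ['EXAMPLE']
```

A shared surname is only a hint about where to look. It takes matching dates, places and parent links to turn it into a real connection.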

If your online family tree obfuscates living persons for privacy reasons, you may have to send me some supplemental information to get me from the names of your living ancestors to the people who are visible in your family tree. If you are hiding data in your family tree for the sake of preserving your security in bank records,  please read this post – then go fix that problem with your bank, ASAP. That is more important than genealogy.

Entering the DNA Rabbit Hole

You will have to learn a little bit about genetic genealogy. It is not a lot. You don’t have to learn organic chemistry or molecular biology.  If you did OK in high school biology you can grasp this stuff with a little studying.

You do have to understand what terms like SNP and cM and MRCA mean. You should know the difference between atDNA, mtDNA, XDNA and YDNA and how each one is relevant to genetic genealogy.

The best place to get started is on the web site of the ISOGG, the International Society of Genetic Genealogy. They have a getting started page that links to many good resources. Bookmark that page now. Start to explore the wiki and the other pages there. I did say that this is a rabbit hole, like in Alice in Wonderland. It may be a while before you hit bottom.

The following genetic genealogy information will help me to analyze our common ancestry. For each DNA sample or kit:

  • Which service was the match on?
  • What is the full name or identifier on that service of the person whose DNA kit matched?
  • Is the DNA kit for that person accessible on multiple DNA services? If so which ones?
  • If the kit is on multiple services, which ones are original sequences and which are uploaded kits?
  • Which of the DNA kits that I manage match? I manage DNA samples for myself, my mother and my sister. Each has a slightly different email address. You might have matched one, two or three kits.
  • Have you uploaded this DNA kit to GEDMatch? If so, please let me know the identifier or the email address you used to identify that kit. I had removed my DNA from GEDMatch … I am rethinking this … but some of the information on my pages is currently incorrect as far as GEDMatch searching goes. We can compare “research” kits which bypass the problem I discuss in my “Discontinuing DNA Genealogy” post.
  • Are there other known family members who have also tested their DNA? We will want all the above data for each of them that match one of my kits.
  • Have you successfully matched with any other relatives who are in a confirmed position in your family tree?

You can send links and short answers. If this information was in your contact email, I am already checking things as time allows.

My Information

It would be incredibly rude of me to ask you to send me lots of information and not tell you how to find mine. I have created a web page that contains links to my family trees online. It also contains the identifiers of the DNA kits that I manage.

Please feel free to contact me again to follow up. I try to respond to all messages in a timely fashion, although, as stated above … this has not always happened with “match” information. I’m trying to catch up!

Also please feel free to send me feedback on this page, either in direct contact, or add a comment below … or both. I would like to know if this helps … hurts … needs more detail … needs less detail … etc.
