Tuesday 27 January 2015

Reasons for grouping into Genetic Families - E, G, I & R1a

- the rationale behind the allocation of people to each of the new Genetic Families

There are 4 haplogroups represented in the project (E, G, I, and R). A haplogroup is simply a group of people with a similar genetic signature. There are 20 Y-DNA haplogroups altogether, named after the letters A through T. Each of these haplogroups can in turn be subdivided into smaller and smaller subgroups (or “subclades”), and all of these groups can be placed on an evolutionary tree (the haplotree) that summarises the evolution of the human Y-DNA signature from its earliest origins in Africa to its arrival in Europe about 45,000 years ago, and in Ireland about 13,000 years ago.

Genetic Families can be considered the smallest subgroups of this evolutionary tree - the final twigs on the branches of the haplotree.

But to start with, let’s take a look at each of these major haplogroups in turn, and the genetic families that have been identified within each one.

The 20 Haplogroups on the Y-DNA Haplotree (from FTDNA)

Haplogroup E and G

The members belonging to these haplogroups are singletons (i.e. no close matches within the project) and have been allocated to the “Ungrouped (non-R)” category.

Haplogroup I

Of the 7 members in Haplogroup I, 4 have been grouped into 2 distinct genetic families and the remaining 3 are ungrouped singletons.

I1-Genetic Family 1 (I1-GF1)

  • The 2 members of I1-GF1 both bear the surname Farrell.
  • They differ from each other by a GD of 0/37 and 1/67, which indicates a very close relationship between the two individuals, possibly as close as second or third cousins.
  • The TiP24 score [1] is 100%. In fact, their TiP Report suggests that there is a 95% chance that the common ancestor was born sometime within the previous 8 generations (which equates to sometime after about 1700 assuming 30 years between generations and a date of birth of the participants of about 1940).
  • Possible “rare” marker values: none obvious
  • The terminal SNP [2] for both these members is I-L205, placing them in the following subclade of the ISOGG Y-DNA tree 2015: I1a1b2 (see http://www.isogg.org/tree/ISOGG_HapgrpI.html)
  • Their MDKA information does not list any specific locations but further research may reveal that their MDKAs were born in the same area.
  • Given the estimated closeness of their relationship, these participants should share their genealogical data and try to ascertain who is their common ancestor or where he may have come from.
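
The date arithmetic behind the "since about 1700" estimate above can be sketched in a few lines of Python. This is only an illustration: the function name is made up, and the defaults (30 years per generation, participants born about 1940) are simply the assumptions stated in the TiP bullet.

```python
def earliest_ancestor_birth_year(generations,
                                 participant_birth_year=1940,
                                 years_per_generation=30):
    """Rough earliest birth year for a common ancestor N generations back.

    Both defaults are assumptions taken from the post, not measured values.
    """
    return participant_birth_year - generations * years_per_generation


# 8 generations back from a participant born about 1940
print(earliest_ancestor_birth_year(8))
```

Changing either assumption shifts the estimate, so the result is best read as "roughly when", not as a precise year.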

I1-GF1 and I1-GF2

I1-Genetic Family 2 (I1-GF2)

  • The 2 members of I1-GF2 share the surname O’Farrell.
  • They have been grouped together despite the fact that member 103146 appears to have some trouble with his results. Specifically, some marker values are missing (the multi-copy markers DYS459, DYS464, and CDYa & b). I’m assuming that this is a technical issue. If one ignores these marker values, the two haplotypes are identical and therefore have been grouped together.
  • The TiP24 score for these two members is only 51% but this may be due to the technical error with the marker values (or alternatively they may be incorrectly grouped together). I will check with FTDNA and see if the error can be corrected. If the corrected data shows that the two haplotypes (i.e. genetic signatures) are identical, then these two individuals are probably very closely related and may share a common ancestor within the last 8 generations or so (i.e. since 1700).
  • Possible “rare” marker values: none obvious
  • The terminal SNPs for these two members are consistent, with P109 (subclade I1a1b1) being downstream of M170 (Hg I) - see http://www.isogg.org/tree/ISOGG_HapgrpI.html
  • The country of origin of both these members’ MDKAs (Most Distant Known Ancestors) is given as Ireland (see Results in Classic mode). However, all other MDKA data is missing completely for one member and there are no locations mentioned for the second member, so it is not currently possible to see if the MDKAs for these two members came from the same part of Ireland. Both members should update their MDKA data accordingly.
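
The idea of comparing two haplotypes while ignoring missing marker values can be sketched as follows. This is a minimal illustration, not FTDNA's method: the function name is hypothetical and the marker values are invented placeholders, not the real results for these two members.

```python
def compare_ignoring_missing(haplotype_a, haplotype_b):
    """Compare two haplotypes marker by marker, skipping missing values.

    Each haplotype is a dict of marker name -> value, with None where the
    value is missing. Returns (markers_compared, list_of_differing_markers).
    """
    compared = 0
    differing = []
    for marker in sorted(set(haplotype_a) & set(haplotype_b)):
        a, b = haplotype_a[marker], haplotype_b[marker]
        if a is None or b is None:
            continue  # missing value: leave this marker out of the comparison
        compared += 1
        if a != b:
            differing.append(marker)
    return compared, differing


# Invented values for illustration only, not real project results.
member_1 = {"DYS393": 13, "DYS390": 22, "DYS459": None, "DYS464": None}
member_2 = {"DYS393": 13, "DYS390": 22, "DYS459": "8-9", "DYS464": "12-14-15-16"}
print(compare_ignoring_missing(member_1, member_2))
```

An empty list of differing markers means the two haplotypes are identical on every marker that both members actually have a value for.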


Haplogroup R1a

The 2 members who belong to Hg R1a now form a distinct genetic family, namely R1a-Genetic Family 1 (R1a-GF1 for short).

  • Their surnames (Farr & Farrar) could well be variants of each other.
  • They differ by a GD of 4/37.
  • On the TiP calculator, the TiP24 score is 96.96% supporting the placement of these 2 members within the same genetic family.
  • Possible “rare” marker values:
      • DYS19 is 16 (occurs in only 10.6% of the general population, but 38% of the R1a population so this is not really rare)
      • DYS439 is 10 (occurs in only 8.5% of the general population, but 77% of the R1a population so this is not really rare)
      • DYS448 is 21 (occurs in only 15% of the general population, and only 2% of the R1a population so this is rare, and its presence in both individuals supports their being grouped together)
  • The terminal SNPs for these two members are consistent, with Z93 (subclade R1a1a1b2) being downstream of M512 (subclade R1a1a) - see http://www.isogg.org/tree/ISOGG_HapgrpR.html
  • The members of this group are probably not very closely related to each other as the TiP tool estimates a 90% chance of a common ancestor within the last 20 generations approximately (91.77% in fact). That would mean that there is a roughly 90% chance that the common ancestor was born some time after 1340 (if one allows 30 years per generation and a dob of the members of about 1940). Equally, there is a 10% chance that their common ancestor was born before 1340 (approximately). Nevertheless, these members should share their genealogical data and try to ascertain who is the common ancestor or where he may have come from. If one or both of them have an extensive pedigree then they might get lucky.
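
The "is this marker value rare?" reasoning in the bullets above can be sketched in Python. This is an illustration under stated assumptions: the function name is made up, and the cut-offs (below 25% counts as rare, below 6% as very rare) are borrowed from Roberta Estes's thresholds as one possible choice. The percentages are the R1a frequencies quoted above.

```python
def classify_rarity(frequency_pct, rare_below=25.0, very_rare_below=6.0):
    """Label a marker value by how common it is within the haplogroup.

    The default cut-offs are an assumption (one published rule of thumb);
    any threshold for "rare" is somewhat arbitrary.
    """
    if frequency_pct < very_rare_below:
        return "very rare"
    if frequency_pct < rare_below:
        return "rare"
    return "not rare"


# Frequencies within the R1a population, as quoted in the bullets above.
r1a_frequencies = {
    "DYS19=16": 38.0,
    "DYS439=10": 77.0,
    "DYS448=21": 2.0,
}
for marker_value, pct in r1a_frequencies.items():
    print(marker_value, classify_rarity(pct))
```

Note that the haplogroup-specific frequency is used here, not the general-population one. A value that looks unusual in the general population can be quite ordinary within R1a.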

R1a-GF1


Next week we’ll take a look at the largest haplogroup within the project, Haplogroup R1b, and the 5 genetic families that belong to it.






[1] The TiP24 score is the value obtained from the TiP Report at 24 generations with the following settings: 1) comparison set to the 37-marker level; 2) default settings (i.e. they do not share a common ancestor more recently than 1 generation ago; display every 4 generations). In this situation, the TiP Report is not being used to estimate the time to most recent common ancestor (TMRCA) but rather as a more accurate estimate of relative closeness than merely GD. This is because GD does not take into account the variable mutation rates of markers whereas the TiP Report does. This technique was developed by James Irvine and is used in his Clan Irwin Surname DNA Study (https://www.familytreedna.com/public/irwin).

[2] There are two types of marker on all chromosomes - SNP markers and STR markers. STR markers are the row of numbers you see on the Results page. SNP markers are a different type of marker and are used to subdivide members of a haplogroup into smaller and smaller subgroups/subclades. The terminal SNP is the marker that identifies the current end of a particular branch of the haplotree. More SNP markers will be discovered in time that will identify additional (smaller) subgroups further “downstream” from the current “terminal SNP”. In other words, the “terminal SNP” changes over time as more markers are discovered and their position on the haplotree is clarified.




Tuesday 20 January 2015

Criteria for allocating members to specific Genetic Families

Genetic Families are simply groups of people who have been grouped together because their genetic signatures are very similar to each other, suggesting a probable common ancestor within the previous several hundred years. Other indicators (such as similar surname, and other genealogical data) help corroborate this allocation. It may be possible (using traditional genealogical research and perhaps further DNA testing) to identify who this common ancestor is.

The Farrell DNA project currently has 88 members who have taken a Y-DNA test, of whom 31 have recently been allocated to new Genetic Families. The names of the new Genetic Families start with the broad Haplogroup (Hg) name (or Haplogroup subgroup name) followed by a specific number e.g. R1a - Genetic Family 1 (shortened to R1a-GF1).



Who currently remains Ungrouped?

The DNA project currently has 57 members who remain ungrouped. Of these, 16 members have only tested out to 12 markers. All of these members have been left in the “Ungrouped” category as testing to only 12 markers does not provide enough information to reliably allocate these members to a specific genetic family. In addition, 5 members have only tested to 25 markers and most of these remain unallocated to a specific genetic family. Those ungrouped members who have only tested to 12 or 25 markers will need to upgrade to 37 markers before allocation can be reliably attempted.

Singletons (those with no close matches) have been placed in the Ungrouped category. This applies to the one member belonging to Haplogroup (Hg) E, the 2 members in Hg G, 3 members in Hg I, and 51 members in Hg R. For ease of reference, the Ungrouped category is further subdivided into “Ungrouped” (containing anyone who belongs to Hg R) and “Ungrouped (non-R)” which contains anyone in any of the other haplogroups.

The number of people in this large Ungrouped category will gradually fall as more people join the project and Ungrouped members can be paired up with a close match. This is why it is so important for those who have only tested out to 12 or 25 markers to upgrade their test to 37 markers (the Y-DNA-37 test). To upgrade, just follow the instructions on the Welcome page here … How to Upgrade

Non-Farrell members (i.e. those who do not have a Farrell surname or variant) are usually left in the Ungrouped category unless there is a very close match with a Farrell member, in which case it is quite possible that there has been an NPE (non-paternity event) somewhere along the direct male line going back to 1300 (when surnames began to be commonly used). Alternatively, this could be a chance finding (e.g. an example of convergence) or the two people may be related via a common ancestor who lived before the common usage of surnames (i.e. prior to 1300).


Criteria for Allocation

Allocation of a particular member to a specific genetic family is based on the presence of some or all of the following criteria. These can be considered to be markers or indicators of a possible close connection, and the more criteria that are present, the more likely it is that there is a real relationship between the members of that family within a genealogical timeframe (i.e. since the common usage of surnames, or about the last 700 years or so). These criteria consist of both traditional genealogical indicators and genetic indicators:
  1. The member has the surname Farrell or one of its putative variants
  2. The Genetic Distance (GD) between two members indicates a close or very close relationship e.g. 0-2 at 37 markers (the member's haplotype which is closest to the group modal haplotype is used as the main comparator)
  3. The TiP24 score [1] is >80% when compared against the group modal haplotype (useful for more distant matches e.g. GD = 3 or 4 at 37 markers) [2]
  4. The presence of "rare" marker values among group members [3]
  5. The results of SNP testing (if performed) are consistent among the members of the particular group (i.e. there is no evidence that some are on separate branches of the Y-Haplotree)
  6. The same surname variant is present / predominant in the particular group - see R1b-GF1 (Farley) and R1b-GF2a (Ferrell)
  7. The same MDKA is present in the particular group (e.g. see R1b-GF3)
  8. The same MDKA location is present in the particular group
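
The two genetic criteria above (numbers 2 and 3) can be sketched in Python. This is a simplified illustration under loud assumptions: Genetic Distance is counted here as the sum of step differences over shared markers, whereas FTDNA's own calculation may treat some markers (for example multi-copy markers) differently. The function names are hypothetical, the marker values are invented, and treating the two criteria as an either/or check is my reading of "some or all of the following criteria", not a fixed rule.

```python
def genetic_distance(haplotype_a, haplotype_b):
    """Simplified GD: sum of absolute repeat-count differences on shared markers.

    Assumes every value is a single integer repeat count.
    """
    shared = set(haplotype_a) & set(haplotype_b)
    return sum(abs(haplotype_a[m] - haplotype_b[m]) for m in shared)


def meets_genetic_criteria(gd_at_37, tip24_pct):
    """Criterion 2 (GD of 0-2 at 37 markers) or criterion 3 (TiP24 above 80%)."""
    return gd_at_37 <= 2 or tip24_pct > 80.0


# Invented values for illustration only (a real comparison would use 37 markers).
group_modal = {"DYS393": 13, "DYS390": 24, "DYS19": 14}
member = {"DYS393": 13, "DYS390": 25, "DYS19": 16}
print(genetic_distance(group_modal, member))

# A GD of 4/37 with a TiP24 score of 96.96% (the R1a-GF1 figures) passes on criterion 3.
print(meets_genetic_criteria(4, 96.96))
```

The genealogical criteria (surname, MDKA, location) are not modelled here. In practice they act as corroboration alongside the genetic indicators.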

You will see from the above criteria how important it is to include information about your MDKA (Most Distant Known Ancestor, a.k.a. Earliest Known Ancestor), including birth & death locations. This information can be used as corroborative evidence to support the allocation of members to a specific genetic family. As described in the previous blog post, the format for entering this information should be as follows:
John Farrell b1862 Longford d1926 New York

If you have specific questions about these criteria or any other aspect of the project, please email me or post them below in the Comments section, or post them on the Farrell Clan Facebook page. I will answer all questions either individually or devote a blog post to each one (so that everyone can benefit from the answer).

Next we’ll take a closer look at each of the new Genetic Families, one by one.

Maurice Gleeson
20 Jan 2015




[1] The TiP24 score is the value obtained from the TiP Report at 24 generations with the following settings: 1) comparison set to the 37-marker level; 2) default settings (i.e. they do not share a common ancestor more recently than 1 generation ago; display every 4 generations). In this situation, the TiP Report is not being used to estimate the time to most recent common ancestor (TMRCA) but rather as a more accurate estimate of relative closeness than merely GD (Genetic Distance). This is because GD does not take into account the variable mutation rates of markers whereas the TiP Report does. This technique was developed by James Irvine and is used in his Clan Irwin Surname DNA Study (https://www.familytreedna.com/public/irwin).

[2] The TiP24 score was set deliberately high for this initial phase of allocation in order to ensure that members in each group are highly likely to be closely related to each other and to minimize any risk (that there might be) of convergence. The TiP score might be relaxed at a later stage or in certain circumstances to allow more “outlying” members to enter each GF, or these might be allocated to a “b” version of the family, as has been done for R1b-GF2.

[3] "Rare" marker values can be considered to be values that are only shared by a very small percentage of people in that particular Haplogroup (or Haplogroup subdivision), and can be used to "define" a particular genetic family. The presence of these so-called "rare" values in a specific individual almost automatically indicates to which genetic family that individual belongs, frequently without any need to look at his other marker values. Kelly Wheaton is able to predict her Wheaton Group B participants on the basis of their first 5 markers as 3 of them are rare. This is a fine example of a very unusual situation where just testing to 12 markers is sufficient for allocation of a member to a particular genetic family - if your surname is Wheaton and you have the 3 rare marker values then you are 99.99% likely to belong to Wheaton Group B. The frequency of marker values in the general population can be accessed via the Sorenson Molecular Genealogy Foundation database and the YHRD database, whereas the marker value frequencies in some of the most common haplogroups (E3a, E3b, G, I, J2, R1a, and R1b) can be accessed via Leo Little’s Y-STR Allele Frequency tables. The threshold for qualifying as rare may be quite arbitrary - Roberta Estes uses a cut-off of <25% for rare, and <6% for very rare, whereas Robert Casey uses weighted values to estimate the rarity of individual marker values and entire haplotypes.



Friday 16 January 2015

Essential information everyone should include on their FTDNA webpage

As a New Year’s Resolution, could I please ask all members of the Farrell DNA Project to at least include their MDKA information (i.e. Most Distant Known Ancestor), as this is a great help to other members and makes collaboration much easier. Unfortunately, 35 of the 88 members who have taken a Y-DNA test have little or no information about their MDKA on their Personal Profiles and this severely limits the usefulness of their results.


Adding your MDKA information

To update or add your MDKA details is very simple. Here is a step-by-step guide:

1. Click on your name in the top right corner of your FTDNA homepage, and then select Account Settings from the drop down menu


2. Then click on the Genealogy tab ... and Earliest Known Ancestors


3. Fill in the details for your Earliest Known Ancestor (EKA, also called MDKA, Most Distant Known Ancestor) on your direct paternal line (i.e. your father’s father’s father, etc.).

The following format is suggested for MDKA data. As well as birth year, it is important to include birth location (this assists subsequent analysis). If there is uncertainty about the birth location, put a question mark beside it e.g. Ballymoney? Co. Wicklow. 
John Farrell b1862 Longford 
Don’t include fullstops (periods), commas, or unnecessary spaces/gaps as you may run out of space - you are only allowed 50 characters. The most important information is the surname, the birth year, and the birth location. If space allows, you can add his wife, and death year.


And remember, this should be your most distant KNOWN ancestor, i.e. one for whom you have documentary evidence that he did exist. In other words, please don’t include speculative ancestors here.

4. Select your MDKA’s Country of Origin from the dropdown menu (choose either Ireland or Unknown).

5. Enter your MDKA's birth location in the Paternal Ancestral Location field. The system should automatically generate latitude and longitude co-ordinates.

6. Scroll down to the bottom of the page and click on Save ... otherwise your changes will be lost!
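
The suggested MDKA format from step 3 can be sketched as a small Python helper. This is purely illustrative: the function and constant names are made up, and the only rules encoded are the ones given above (name, then b + birth year, then birth location, optionally d + death year and death location, no punctuation, at most 50 characters).

```python
MAX_MDKA_LENGTH = 50  # character limit mentioned in step 3


def format_mdka(name, birth_year, birth_place, death_year=None, death_place=None):
    """Build an MDKA entry such as 'John Farrell b1862 Longford'."""
    parts = [name, f"b{birth_year}", birth_place]
    if death_year is not None:
        parts.append(f"d{death_year}")
        if death_place:
            parts.append(death_place)
    entry = " ".join(parts)
    if len(entry) > MAX_MDKA_LENGTH:
        raise ValueError(
            f"MDKA entry is {len(entry)} characters; the limit is {MAX_MDKA_LENGTH}"
        )
    return entry


print(format_mdka("John Farrell", 1862, "Longford"))
print(format_mdka("John Farrell", 1862, "Longford", 1926, "New York"))
```

If the entry is too long, drop the death details first, since the surname, birth year, and birth location are the most important pieces.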




Upload a Gedcom file of your Family Tree

It is also extremely helpful to upload a gedcom of your family tree to FTDNA. Having instant access to your key ancestry data helps other researchers collaborate with you.

To upload a gedcom is simple.

1. Click on your name in the top right corner of your FTDNA homepage



2. Then click on the link “Upload Family Tree” under the Surnames heading and follow the instructions. 


You can find out how to create a gedcom by clicking here - it all depends on the programme you are using to create your family tree. If you don't find your own particular programme, simply google its name and the phrase "create gedcom".


Sharing links to your online Family Tree

Another way of facilitating collaboration is to include links to any online family trees you might have. These links can be added to your Personal Profile page and I encourage all members to add links to all their online trees. Many people have trees on Ancestry.com but these can only be viewed by people who are already members of Ancestry or who have been specifically invited by the tree owner. So, in addition to posting a link to your Ancestry tree (or MyHeritage, GenesReunited, etc), I recommend also uploading a gedcom of your tree to Rootsweb, and posting that link too, as this is completely free and can be viewed by anyone.[1]

To add links to your online trees to your Personal Profile page, just do the following:

1. Click on your name in the top right corner of your FTDNA homepage



2. Click on the tab “Account Settings”



3. In the “About Me” box, enter the following information:
My family Tree can be found online at the following links (delete/amend as appropriate) … ANCESTRY - (put the link here) …….... MYHERITAGE - (link here) …….… ROOTSWEB - (link here)

4. Then tick / check “Basic Profile” underneath the “About Me” box (you can tick Full Profile if you prefer)

5. And lastly click “Upload” - if any of your matches click on your name, they will see this information in your Personal Profile


Here's my Personal Profile as an example of what my matches see when they click on my name.



Having family tree information available will encourage your genetic matches to get in touch with you and collaborate. Sharing genealogical information with your close matches is essential if you want to break down those Brick Walls. It is also important to research your Farrell line as far back as you can possibly go, and to have all your data referenced as fully as possible. This is obviously an ongoing task for all of us!

Remember, the DNA is only a pointer. The real work begins when it shows you who your genetic cousins are – these are the people with whom you need to collaborate.

Maurice Gleeson 
16 Jan 2015




[1]  To upload your gedcom to the free Rootsweb WorldConnect Project, just click here to set up a free account, and then click here (http://wc.rootsweb.ancestry.com/cgi-bin/wcus), click on “Add New Tree” and follow the instructions.




Wednesday 14 January 2015

Welcome to the blog

Welcome to the new blog for the Farrell DNA Project.

The DNA project has several objectives but one of the primary aims is to identify people who are close genetic cousins to each other so that these people can collaborate, share genealogical information, and thus help each other to break down the Brick Walls in their own particular family tree research.[1] If sufficient members join, it will hopefully be possible to determine the origins of the Farrell surname and its evolution over time.

This project is hosted on FamilyTreeDNA (FTDNA) and you can easily access it by either googling “Farrell FTDNA” or simply following this link here …
www.familytreedna.com/public/FARRELL DNA Project/

The project is open to anyone with a suspected Farrell ancestor, especially anyone whose surname is Farrell or one of its many possible variants, including any of the following names … Faddle, Farell, Farlee, Farley, Farr, Farris, Farral, Farrel, Farrell, Farrelly, Farrol, Fearghail, Ferrall, Ferrally, Ferrell, Ferrill, Frawley, McFaddle, McFarell, McFarley, McFarral, McFarrel, McFarrell, McFarrelly, McFarrol, McFearghail, McFerrally, McFerrell, McFerrill, McFrawley, O’Faircheallaigh, O'Faddle, O'Farell, O'Farley, O'Farral, O'Farrel, O'Farrell, O'Farrelly, O'Farrol, O'Fearghail, O'Ferrally, O'Ferrell, O'Ferrill, O'Frawley


Current Status

As of December 2014, the previous Project Administrator, Jim Denning, has retired and the new Project Administrator is Maurice Gleeson. I would like to say a big thank you to Jim for his many years of work on this project. 


The membership has grown under Jim’s guidance and direction to 137 participants. The primary focus of the project is the Farrell surname and its inheritance along the direct male line (your father’s father’s father’s line). Hence Y-DNA is of particular importance and of the 137 members, 88 have undertaken a Y-DNA test, broken down as follows:
  • 111 markers – 12 members have bought this test
  • 67 markers – 32 members
  • 37 markers – 23 (this is the minimum recommended)
  • 25 markers only – 5
  • 12 markers only – 16
A Y-DNA-37 test is the minimum recommended for reliable allocation to a particular genetic family and all members are encouraged to upgrade to at least this 37-marker test. 
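
As a quick check on the figures above, the breakdown can be tallied in a few lines of Python. The variable names are my own; the counts are the ones in the list above.

```python
# Number of members at each Y-DNA test level (markers tested -> members).
y_dna_tests = {111: 12, 67: 32, 37: 23, 25: 5, 12: 16}

MINIMUM_RECOMMENDED = 37

total = sum(y_dna_tests.values())
below_minimum = sum(count for markers, count in y_dna_tests.items()
                    if markers < MINIMUM_RECOMMENDED)

print(f"{total} members tested; {below_minimum} would need to upgrade")
```

The members below the 37-marker minimum are the ones who would benefit most from upgrading.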


How to upgrade

If you wish to upgrade from 12 or 25 markers to 37 markers or higher, just click on the blue Upgrade button on your FTDNA Homepage and then click on the appropriate Upgrade Price button on the right and follow the instructions ($99 for upgrading to Y-DNA-37 from Y-DNA-12).




How to join the project

If you have already purchased a Y-DNA-37 test, you can join the project by simply clicking here and choosing Option A … 

New members can join the project by clicking here and purchasing a Y-DNA-37 test (there is a discount of $20 off the usual price if you buy it here, so you get it for $149 instead of $169) … https://www.familytreedna.com/group-join.aspx?&group=Farrell&vGroup=FARRELL%20DNA%20Project




Recent changes

Following a review of the current status of the project, there have been some changes to the way the Project Webpage looks. Firstly, statistical data has been added to the home page of the project website on FTDNA. This gives a very useful overview of the status of the project and the current membership. Also, information on the MDKA (Most Distant Known Ancestor) for project members is now visible on the Y-DNA Results pages (in both the Classic View and Colorized View).

But the most significant change is the rearrangement of the project members to reflect recent additions to the project and new Y-DNA data. As a result, 8 distinct genetic families have now been identified. Subsequent blogs will discuss the criteria for allocating members to these new groups and we will take a much closer look at each of these 8 individual genetic families.


Any questions?

If you have specific questions about the Farrell DNA Project, please post them below in the Comments section, or post them below my announcement for each new blog post on the Farrell Clan Facebook page. I will answer all questions either individually or devote a blog to each one (so that everyone can benefit from the answer).

Maurice Gleeson
14 Jan 2015


[1] A complete list of the revised objectives of the DNA project can be found here … https://www.familytreedna.com/public/FARRELL%20DNA%20Project/default.aspx?section=goals