Introgression analysis with RAD-seq data


Many terms are used interchangeably in the literature on gene flow. Three that are often confused, but that can have distinct meanings, are admixture, introgression, and hybridization.

Admixture: A detectable genetic signature of mixed ancestry from two or more distinct lineages. It is a pattern, not a process: two populations can appear admixed even if they have not exchanged genetic material.

Introgression: The movement of alleles from one or more distinct lineages into another. It is a process that occurs over time. We rarely observe introgression directly, so we aim to reconstruct introgression events that occurred in the past.

Hybridization: Successful reproduction between individuals from two distinct lineages. We often observe hybridization among extant individuals (e.g., F1s), but further investigation is necessary to determine whether it results in introgression of alleles between lineages.

Drift-based statistics for modelling admixture

Methods for detecting admixture within and between populations typically assume that existing variation in those populations results from the sorting of standing variation that was present in their common ancestor. They therefore do not account for de novo mutations. Examples include STRUCTURE and TREEMIX.

Phylogenetic models for modelling admixture

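The D-statistic (ABBA-BABA) test. For a four-taxon tree (((P1, P2), P3), O), it compares the counts of two discordant site patterns, ABBA and BABA. The two are expected to be equally frequent under incomplete lineage sorting alone, so a significant excess of one suggests introgression between P3 and either P1 or P2.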
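As a rough illustration of the test, the sketch below counts ABBA and BABA site patterns in four aligned sequences ordered (((P1, P2), P3), O) and computes D = (ABBA - BABA) / (ABBA + BABA). The function name `d_statistic` and the toy sequences are hypothetical, and the sketch assumes one haploid sequence per taxon with the outgroup carrying the ancestral allele. It is not the implementation used by any particular program.

```python
def d_statistic(p1, p2, p3, outgroup):
    """Count ABBA and BABA sites and return (abba, baba, D).

    Assumes four aligned haploid sequences of equal length, ordered
    (((P1, P2), P3), O), with the outgroup allele treated as ancestral.
    """
    if not (len(p1) == len(p2) == len(p3) == len(outgroup)):
        raise ValueError("sequences must all be the same length")

    valid = set("ACGT")
    abba = baba = 0
    for a, b, c, o in zip(p1, p2, p3, outgroup):
        # Skip sites with missing or ambiguous bases.
        if not {a, b, c, o} <= valid:
            continue
        # Only sites where P3 carries a derived allele are informative.
        if c == o:
            continue
        if a == o and b == c:
            abba += 1  # P2 shares the derived allele with P3
        elif b == o and a == c:
            baba += 1  # P1 shares the derived allele with P3

    total = abba + baba
    d = (abba - baba) / total if total else float("nan")
    return abba, baba, d


# Toy example: two ABBA sites and one BABA site.
abba, baba, d = d_statistic("AAAAGA", "AGGAAA", "AGGGGA", "AAAAAA")
print(abba, baba, round(d, 3))
```

A positive D here indicates more sharing between P2 and P3 than between P1 and P3. On its own the value says nothing about significance, which requires a measure of its variance.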


Full-genome implementations

The software ADMIXTOOLS is often used to calculate ABBA-BABA scores from genome-wide data, where the value is often measured using a sliding-window approach along the genome.
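The sliding-window idea can be sketched in a few lines. The example below assumes the sites have already been scored as per-site 0/1 indicators for the ABBA and BABA patterns along one chromosome. The function `windowed_d` and its `window` and `step` parameters are hypothetical names chosen for illustration and do not correspond to the interface of any existing program.

```python
def windowed_d(abba, baba, window, step):
    """Compute D in windows along a sequence.

    abba, baba: equal-length lists of 0/1 flags, one entry per site.
    Returns a list of (window_start, D) tuples; D is None for windows
    that contain no ABBA or BABA sites.
    """
    if len(abba) != len(baba):
        raise ValueError("abba and baba must be the same length")
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")

    results = []
    for start in range(0, len(abba) - window + 1, step):
        n_abba = sum(abba[start:start + window])
        n_baba = sum(baba[start:start + window])
        total = n_abba + n_baba
        d = (n_abba - n_baba) / total if total else None
        results.append((start, d))
    return results


# Toy example: eight sites, two non-overlapping windows of four sites.
abba_flags = [1, 0, 1, 0, 0, 0, 0, 1]
baba_flags = [0, 1, 0, 0, 0, 1, 1, 0]
for start, d in windowed_d(abba_flags, baba_flags, window=4, step=4):
    print(start, d)
```

Setting `step` smaller than `window` gives overlapping windows, which smooths the profile but makes neighboring values non-independent.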

RAD-seq implementation

Because RAD-seq samples only a subset of the genome, the data must be analyzed somewhat differently from full-genome data.
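One common way to handle many short loci without a genomic order to slide a window along is to treat each locus as an independent unit and bootstrap over loci to estimate the variance of D. The sketch below is an assumption about how such an analysis might look, not a description of a specific program: `bootstrap_d`, `locus_counts`, and `nboots` are hypothetical names, and each locus is assumed to have been reduced to a pair of (ABBA, BABA) counts.

```python
import random
import statistics


def bootstrap_d(locus_counts, nboots=200, seed=None):
    """Estimate D, its bootstrap standard deviation, and a Z-score.

    locus_counts: list of (abba, baba) count pairs, one pair per locus.
    Loci are resampled with replacement; sites within a locus stay together.
    """
    rng = random.Random(seed)

    def pooled_d(counts):
        abba = sum(c[0] for c in counts)
        baba = sum(c[1] for c in counts)
        total = abba + baba
        return (abba - baba) / total if total else float("nan")

    observed = pooled_d(locus_counts)

    replicates = []
    for _ in range(nboots):
        sample = rng.choices(locus_counts, k=len(locus_counts))
        d = pooled_d(sample)
        if d == d:  # drop replicates with no informative sites (NaN)
            replicates.append(d)

    sd = statistics.stdev(replicates) if len(replicates) > 1 else float("nan")
    # Z is undefined when the replicates show no variation.
    z = observed / sd if (sd == sd and sd > 0) else float("nan")
    return observed, sd, z


# Toy example with four loci.
counts = [(2, 1), (0, 1), (3, 0), (1, 1)]
d, sd, z = bootstrap_d(counts, nboots=200, seed=1)
print(round(d, 3))
```

Resampling whole loci rather than individual sites keeps linked sites together, so the variance estimate is not inflated by treating sites within a locus as independent.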

Phylogenetic invariants and tetrad

A more general implementation of the general Markov model.