1. Evolutionary history of plant diversification.
2. Software development for evolutionary genomics research.
3. Methods for inferring phylogenetic networks.
4. Modeling the evolutionary effects of species interactions.
5. Modeling the accumulation of speciation genes.
Modern research in ecology and evolution requires a diverse skill set, from organismal biology to genomics, statistics, and computational biology. My research and teaching center on integrating these skills.
And it's an exciting time for this! Genomic technologies are revolutionizing the study of ecology and evolution.
"invariants" are SNP patterns that occur equally (sensu word embedding).
Order SNP counts into 3 matrices for each quartet of samples.
One matrix contains more patterns matching the tree (e.g., AABB)
Two matrices contain patterns discordant with the tree (e.g., BABA, ABBA)
The matrix matching the tree aligns along rows/cols (invariants align).
SVD measures how filled each matrix is (its rank) and quickly finds the best tree.
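The steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not the SVDquartets implementation: the function names, the 0-3 base coding, and the toy data are assumptions made for the example, and the `rank=10` cutoff reflects my understanding of the published method rather than anything stated on these slides.

```python
import numpy as np

# The three possible splits of a quartet (indices into the 4 samples).
SPLITS = {
    "12|34": ((0, 1), (2, 3)),
    "13|24": ((0, 2), (1, 3)),
    "14|23": ((0, 3), (1, 2)),
}

def flatten_counts(snps, split):
    """Count site patterns into a 16x16 matrix for one split.

    snps: integer array of shape (4, nsites) with bases coded 0-3.
    Rows index the joint state of the first pair, columns the second pair.
    """
    (a, b), (c, d) = split
    rows = snps[a] * 4 + snps[b]
    cols = snps[c] * 4 + snps[d]
    mat = np.zeros((16, 16))
    np.add.at(mat, (rows, cols), 1)  # accumulate repeated patterns
    return mat

def svd_score(mat, rank=10):
    """Distance from low rank: norm of the singular values beyond `rank`."""
    sing = np.linalg.svd(mat / mat.sum(), compute_uv=False)
    return float(np.sqrt(np.sum(sing[rank:] ** 2)))

def best_split(snps, rank=10):
    """Score all three splits and return the name of the lowest-scoring one."""
    scores = {name: svd_score(flatten_counts(snps, s), rank)
              for name, s in SPLITS.items()}
    return min(scores, key=scores.get), scores

# Toy data (made up): samples 1 and 2 always match, as do samples 3 and 4.
rng = np.random.default_rng(0)
left = rng.integers(0, 4, size=2000)
right = rng.integers(0, 4, size=2000)
toy_snps = np.vstack([left, left, right, right])
best, scores = best_split(toy_snps)
print(best, scores)
```

In the toy data the matrix for the true split fills only a few rows and columns, so its score is close to zero, while the two discordant matrices are spread across the diagonal and score higher.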
SVDquartets discards the other two subtrees, which contain information relevant to inferring admixture, even though the relative frequencies of these subtrees are known to be informative about introgression (e.g., ABBA-BABA imbalance) (Durand et al. 2011).
SVDquartets examines only one quartet at a time. Because quartets are not independent of each other, introgression in one may affect multiple other quartets (Eaton et al. 2012, 2015). Ideally, all taxa, or all quartets, would be examined simultaneously.
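The ABBA-BABA imbalance mentioned above is usually summarized as D = (nABBA - nBABA) / (nABBA + nBABA). The sketch below computes it for one haplotype per taxon; the function name and the toy genotypes are made up for illustration, and a real analysis would typically use allele frequencies and a resampling test for significance.

```python
import numpy as np

def d_statistic(p1, p2, p3, out):
    """Patterson's D from four aligned haplotypes of biallelic SNPs.

    The outgroup allele is treated as ancestral (A); any other allele
    is treated as derived (B).
    """
    p1, p2, p3, out = (np.asarray(x) for x in (p1, p2, p3, out))
    d1, d2, d3 = p1 != out, p2 != out, p3 != out  # derived-allele masks
    abba = int(np.sum(~d1 & d2 & d3))  # P2 and P3 share the derived allele
    baba = int(np.sum(d1 & ~d2 & d3))  # P1 and P3 share the derived allele
    if abba + baba == 0:
        return float("nan")
    return (abba - baba) / (abba + baba)

# Toy example (made up): three ABBA sites, one BABA site, one uninformative site.
p1 = [0, 0, 0, 1, 0]
p2 = [1, 1, 1, 0, 0]
p3 = [1, 1, 1, 1, 0]
out = [0, 0, 0, 0, 0]
d = d_statistic(p1, p2, p3, out)
print(d)  # (3 - 1) / (3 + 1) = 0.5
```

A value near zero is expected when the two discordant patterns arise only from incomplete lineage sorting; a consistent excess of one pattern suggests introgression.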
Unique fingerprint for different admixture scenarios
ExtraTrees Classifier (sklearn) accurately identifies the admixture edge and its directionality with little training and relatively little data (tens of thousands of SNPs).
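As a rough sketch of how such a classifier might be set up with scikit-learn: the scenario labels, feature dimensions, and simulated "fingerprints" below are placeholders invented for this example and do not reproduce the actual training data or pipeline. Only `ExtraTreesClassifier` and `train_test_split` are real library calls.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)

# Hypothetical admixture scenarios to tell apart.
SCENARIOS = ["no_admixture", "A_into_B", "B_into_A"]
n_per_class, n_features = 200, 30

# Placeholder fingerprints: each scenario gets its own expected pattern counts,
# and each simulated replicate is a noisy (Poisson) draw around them.
expected = rng.uniform(20, 60, size=(len(SCENARIOS), n_features))
X = np.vstack([
    rng.poisson(expected[k], size=(n_per_class, n_features))
    for k in range(len(SCENARIOS))
])
y = np.repeat(np.arange(len(SCENARIOS)), n_per_class)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

clf = ExtraTreesClassifier(n_estimators=200, random_state=0)
clf.fit(X_train, y_train)

accuracy = clf.score(X_test, y_test)
predicted = [SCENARIOS[k] for k in clf.predict(X_test[:5])]
print(accuracy, predicted)
```

In practice the rows of `X` would be site-pattern count matrices flattened into vectors, simulated under known admixture scenarios, so that the fitted model can then be applied to counts from empirical data.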
Efficient handling of large data sets, clear and reproducible coding skills, statistical literacy, understanding of data within a specialized field (e.g. genomics).
Web-based environment for creating and sharing reproducible code. Our example explored tree-thinking and viewing trees as data using the toytree package in Python.
Reading trees involves interpreting the order in which lineages share common ancestors by tracing relationships backwards from the tips towards the root. Rotating nodes does not affect these relationships, even though the order of the tips changes. Which topology is different?
Phylogenetic trees are more than just pictures: they represent a data structure that can be interpreted and used in model-based analyses. They are stored in Newick format.
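To make the "trees as data" idea concrete, here is a small pure-Python sketch that parses simple Newick strings and compares rooted topologies by the sets of tips under each node. It is not the toytree API; the helper names are invented for this example, and the parser handles only tip names, without branch lengths or node labels.

```python
def parse_newick(newick):
    """Parse a simple Newick string (tip names only) into nested tuples."""
    text = newick.strip().rstrip(";")
    pos = 0

    def parse_node():
        nonlocal pos
        if text[pos] == "(":
            pos += 1  # skip "("
            children = [parse_node()]
            while text[pos] == ",":
                pos += 1  # skip ","
                children.append(parse_node())
            pos += 1  # skip ")"
            return tuple(children)
        start = pos
        while pos < len(text) and text[pos] not in ",()":
            pos += 1
        return text[start:pos]

    return parse_node()

def clades(tree):
    """Return the set of clades, each a frozenset of the tip names below a node."""
    found = set()

    def tips(node):
        if isinstance(node, str):
            return frozenset([node])
        members = frozenset().union(*(tips(child) for child in node))
        found.add(members)
        return members

    tips(tree)
    return found

def same_topology(newick_a, newick_b):
    """Two rooted trees match if they contain exactly the same clades."""
    return clades(parse_newick(newick_a)) == clades(parse_newick(newick_b))

original = "((a,b),(c,d));"
rotated = "((d,c),(b,a));"    # nodes rotated, same relationships
different = "((a,c),(b,d));"  # different sister pairs

print(same_topology(original, rotated))    # True
print(same_topology(original, different))  # False
```

Rotating nodes changes the order of names in the string but not the clades, which is why the first comparison matches and the second does not.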
Species rich:
>600 species worldwide, approximately 300 endemic to Hengduan.
We collected >60 species from 100 locations in 2018.
Morphologically diverse:
Spectacular floral diversity and abundant homoplasy;
similar forms have evolved repeatedly.
Complex history of assembly:
Mountain uplift over millions of years, glacial cycles over thousands of years, and river and mountain barriers lead to constantly shuffling communities (and species interactions).
Negative fitness consequences imposed by one organism on another by disrupting successful reproduction (a form of selection on reproductive traits/timing/behavior)
Does interspecific competition/interference drive floral divergence?
Is floral divergence associated with genetic divergence/speciation?
Elongate styles have evolved multiple times (Ree 2005) and facilitate pollen competition among species (Tong and Huang 2016).
Hypothesis: Differences among populations (within species) are a result of interspecific interactions driving character displacement in local communities.
110 individuals across 15 targeted locations.
RAD-seq (original) PstI enzyme, Floragenex Inc.
5.5M reads per sample; ipyrad min50 denovo assembly
20K loci, 21% missing, 286K SNPs
Lande (1976): Selection pulls the mean phenotype towards a local optimum, while Gene Flow homogenizes phenotypes among populations, and they evolve by stochastic Drift.
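The interplay of these three forces can be illustrated with a toy simulation. This is a deliberately simplified discrete-time caricature, not Lande's equations: the update rule, the parameter names (`s`, `m`, `drift_sd`), and the example values are all assumptions chosen for illustration.

```python
import numpy as np

def simulate_means(optima, s, m, drift_sd, generations, seed=0):
    """Track the mean phenotype of each population through time.

    optima:   local phenotypic optimum of each population
    s:        fraction of the distance to the optimum closed per generation
    m:        fraction of each population replaced by the overall mean
    drift_sd: standard deviation of the random change per generation
    """
    rng = np.random.default_rng(seed)
    optima = np.asarray(optima, dtype=float)
    z = np.full(optima.shape, optima.mean())  # all populations start alike
    for _ in range(generations):
        z = z + s * (optima - z)                          # selection
        z = (1 - m) * z + m * z.mean()                    # gene flow
        z = z + rng.normal(0.0, drift_sd, size=z.shape)   # drift
    return z

optima = [0.0, 10.0]
isolated = simulate_means(optima, s=0.5, m=0.0, drift_sd=0.0, generations=50)
connected = simulate_means(optima, s=0.5, m=0.5, drift_sd=0.0, generations=50)
noisy = simulate_means(optima, s=0.5, m=0.5, drift_sd=0.5, generations=50)
print(isolated, connected, noisy)
```

Without gene flow each population settles on its own optimum; with gene flow the populations are held closer together than their optima; adding drift scatters the means around those expectations.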
P. cranolopha has a longer style when co-occurring with closer relatives; supports gametophytic "arms-race" hypothesis.