Global research program contributing thousands of specimens to herbaria
Development of software tools for research and education
Decreasing costs have made it relatively easy to generate large genomic datasets
Characterize whole genomes from a subset of sequenced markers.
It is important to examine evolutionary history across the entire genome.
Introgression is common throughout the history of many lineages.
Assemble and analyze RAD-seq type data for phylogenetic datasets.
I teach Programming for Biology ("Hack the Planet") to teach students basic coding skills while guiding them through the process of developing and distributing a software tool.
Shadie: A Python wrapper to perform SLiM simulations of plant life cycles.
Traversome: Hybrid PanGenome Assembler from Mixed Samples
superMCC: Iteratively applies BPP to calibrate node ages on large trees
ipcoal: integrates msprime coalescent simulations with species tree & network inference.
toytree: Python-based Tree object, manipulation, visualization, and evolutionary analysis library.
Genomes are composed of a mosaic of segments inherited from different ancestors,
each separated by past recombination events.
Consequently, genealogical relationships vary spatially across genomes.
The multispecies coalescent (MSC) describes the expected distribution of unlinked genealogies, as a function of demographic model parameters (N$_e$, $\tau$, topology).
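As a small worked case of how the MSC maps parameters to a distribution of genealogies, consider a three-species tree ((A,B),C) with one sample per species. The sketch below uses the standard result that the gene tree matches the species tree with probability 1 - (2/3)exp(-t), where t = τ / (2N$_e$) is the internal branch length in coalescent units. The function name and the parameter values are hypothetical, chosen only for illustration, and a diploid N$_e$ with τ in generations is assumed.

```python
import math

def three_taxon_topology_probs(tau_generations, ne):
    """MSC probabilities of the three rooted gene tree topologies
    for species tree ((A,B),C), one sample per species."""
    t = tau_generations / (2.0 * ne)   # internal branch in coalescent units
    p_no_coal = math.exp(-t)           # A and B fail to coalesce on that branch
    concordant = 1.0 - (2.0 / 3.0) * p_no_coal
    discordant = p_no_coal / 3.0       # each of the two discordant topologies
    return {
        "((A,B),C)": concordant,
        "((A,C),B)": discordant,
        "((B,C),A)": discordant,
    }

# Hypothetical values: internal branch of 200,000 generations, Ne = 100,000.
probs = three_taxon_topology_probs(tau_generations=200_000, ne=100_000)
for topology, p in probs.items():
    print(topology, round(p, 4))
```

Shorter internal branches or larger N$_e$ push the three probabilities towards 1/3 each.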
The expected distribution of linked genealogical variation is poorly characterized.
(Martin & Belleghem 2017)
An approximation of the coalescent with recombination
Given a starting genealogy, the change to the next genealogy is modeled as a Markov process (a single transition), which enables a tractable likelihood framework.
Process: recombination occurs with uniform probability anywhere on a tree (at time t$_{1}$), creating a detached subtree, which re-coalesces above t$_{1}$ with an ancestral lineage.
PSMC (Li & Durbin 2011) and MSMC (Schiffels & Durbin 2014) use pairwise coalescent times between sequential genealogies to infer changes in N$_e$ through time.
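The process described above can be sketched for the simplest case of two samples in a single population of constant size, with time in coalescent units so that a pair of lineages coalesces at rate 1. This is a minimal illustration under the original SMC assumptions, not the implementation used in any of the tools named in this talk; the function name is hypothetical. Under SMC' the detached lineage could also re-coalesce with its own original branch, which would give a no-change event.

```python
import random

def smc_next_tmrca(t_old, rng):
    """One SMC transition for a two-sample genealogy (coalescent units).

    t_old is the current coalescence time. Returns the recombination
    time t1 and the coalescence time of the next genealogy.
    """
    # Recombination falls uniformly along the branch, so t1 ~ U(0, t_old).
    t1 = rng.uniform(0.0, t_old)
    # The detached lineage re-coalesces above t1 with the one remaining
    # lineage, after an exponential waiting time with rate 1.
    t_new = t1 + rng.expovariate(1.0)
    return t1, t_new

rng = random.Random(42)
t_old = 1.0
transitions = [smc_next_tmrca(t_old, rng) for _ in range(10_000)]
mean_t_new = sum(t_new for _, t_new in transitions) / len(transitions)
print("mean next coalescence time:", round(mean_t_new, 3))
```

With more than two samples the same two steps apply, but the re-coalescence rate depends on the number of lineages present at each time.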
ARGweaver (Rasmussen et al. 2014) and ARGweaver-D (Hubisz & Siepel 2020) use an SMC'-based conditional sampling method to infer ARGs from sequence data.
(a) no-change; (b-c) tree-change; and (d) topology-change.
(Deng et al. 2021)
Expected Tree and Topology Distances represent new spatial genetic information.
Barriers to coalescence and variable N$_e$ among species tree intervals.
Patrick McKenzie
PhD student
Genealogy embedding table with piecewise constant coalescence rates in all intervals between coalescent events or population intervals.
Unlike single-population models, which exhibit monotonic probabilities over the length of a branch, MSC models exhibit variable rates (both $k$ and N$_e$ can change).
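To illustrate why rates vary along a path through the species tree, the sketch below follows one detached lineage through a series of intervals, each with its own number of available lineages $k$ and its own N$_e$. It assumes the lineage coalesces at a constant rate $k$ / (2N$_e$) per generation within an interval; the function name, the tuple layout, and the interval values are hypothetical and are not taken from the embedding table shown in the talk.

```python
import math

def recoalescence_probs(intervals):
    """Probability that a detached lineage re-coalesces in each interval.

    intervals: list of (k, ne, duration_in_generations), ordered from
    the recombination point towards the root.
    Returns (per-interval probabilities, probability of never coalescing).
    """
    probs = []
    surviving = 1.0  # probability the lineage is still uncoalesced
    for k, ne, duration in intervals:
        rate = k / (2.0 * ne)                # constant within the interval
        p_stay = math.exp(-rate * duration)  # no coalescence in this interval
        probs.append(surviving * (1.0 - p_stay))
        surviving *= p_stay
    return probs, surviving

# Hypothetical intervals: k drops at coalescent events, Ne changes at a
# population boundary, and the root interval is unbounded.
intervals = [
    (3, 1e5, 5e4),
    (2, 1e5, 1e5),
    (2, 5e5, 2e5),
    (1, 5e5, math.inf),
]
probs, leftover = recoalescence_probs(intervals)
print([round(p, 4) for p in probs], leftover)
```

Because both $k$ and N$_e$ change between intervals, the per-interval probabilities need not decrease steadily towards the root.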
Expected number of sites until a recombination event is observed.
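As a rough illustration, assume recombination events fall along the genome as a Poisson process with rate r × L per site, where r is the per-site, per-generation recombination rate and L is the summed branch length of the current genealogy in generations. The expected distance to the next event is then 1 / (rL). The value of L below is hypothetical; r matches the example parameters used in this talk. This counts every recombination event, so waiting distances to tree-change or topology-change events would be longer, since only a fraction of events produce them.

```python
def expected_sites_to_recombination(recomb_rate, total_branch_length):
    """Expected number of sites until the next recombination event,
    assuming events occur at rate recomb_rate * total_branch_length per site."""
    return 1.0 / (recomb_rate * total_branch_length)

# Hypothetical genealogy with 2,000,000 generations of total branch length.
expected = expected_sites_to_recombination(
    recomb_rate=2e-9, total_branch_length=2_000_000
)
print(round(expected, 1))  # about 250 sites
```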
Analytical results match expectation of stochastic coalescent simulations.
Calculate the likelihood of an ARG given a species tree (S)
Topology-changes are more informative than tree-changes; optima at the true simulated values.
Example: loci=50, length=0.1Mb, recomb=2e-9, samples-per-lineage=4.
Metropolis-Hastings MCMC converges on the correct values with increasing data.
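The general behavior can be shown with a toy random-walk Metropolis-Hastings sampler. This is a sketch under simplifying assumptions, not the sampler or likelihood used for the results here: the data are simulated exponential waiting distances with a single unknown rate, the prior is flat on positive rates, and all function names, step sizes, and sample sizes are hypothetical.

```python
import math
import random
import statistics

def log_likelihood(rate, n, total):
    """Log-likelihood of n exponential waiting distances summing to total."""
    if rate <= 0:
        return -math.inf
    return n * math.log(rate) - rate * total

def metropolis_hastings(distances, n_steps, burn_in, start, step_sd, rng):
    """Random-walk Metropolis-Hastings with a flat prior on rate > 0."""
    n, total = len(distances), sum(distances)
    current = start
    current_ll = log_likelihood(current, n, total)
    samples = []
    for step in range(n_steps):
        proposal = current + rng.gauss(0.0, step_sd)  # symmetric proposal
        proposal_ll = log_likelihood(proposal, n, total)
        # Accept with probability min(1, likelihood ratio).
        if rng.random() < math.exp(min(0.0, proposal_ll - current_ll)):
            current, current_ll = proposal, proposal_ll
        if step >= burn_in:
            samples.append(current)
    return samples

rng = random.Random(1)
true_rate = 0.004
posteriors = {}
for n_obs in (50, 2000):
    data = [rng.expovariate(true_rate) for _ in range(n_obs)]
    posteriors[n_obs] = metropolis_hastings(
        data, n_steps=20_000, burn_in=2_000, start=0.01, step_sd=2e-4, rng=rng
    )
    print(n_obs, statistics.mean(posteriors[n_obs]),
          statistics.pstdev(posteriors[n_obs]))
```

The posterior for the larger dataset should be narrower and centered closer to the simulated rate, which is the sense in which the chain converges with increasing data.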
Example: loci=50, length=0.1Mb, recomb=2e-9, samples-per-lineage=4.
Negative fitness consequences imposed by one organism on another by disrupting successful reproduction.
Have evolved multiple times independently (Ree 2005) and facilitate pollen tube competition (Tong and Huang 2016).
Transcription response in styles and pollen tubes during con- and heterospecific crosses in natural communities at RMBL in Colorado.
Linking Phylogenetic Inference at Genome-wide and Genealogical Scales
Divergent selection is greater between populations in sympatry than allopatry (e.g., benthic/limnetic sticklebacks) to reduce competition for limited resources.
The challenge/opportunity in Pedicularis is that there are many interacting species, and many have convergent phenotypes. We need a community model of character displacement.
Hypothesis: Differences among populations (within species) are a result of interspecific interactions driving character displacement in local communities.
Lande (1976): Selection pulls the mean phenotype towards a local optimum, while Gene Flow homogenizes phenotypes among populations, and they evolve by stochastic Drift.
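The three forces can be combined in a simple discrete-time sketch of population mean phenotypes. This is an illustrative toy, not Lande's equations or the community model proposed here: the linear pull towards the optimum (`alpha`), the migration towards the mean of all populations (`m`), the Gaussian drift term (`drift_sd`), and all values are assumptions made for the example.

```python
import random

def simulate_means(optima, n_generations, alpha, m, drift_sd, rng, start=0.0):
    """Track the mean phenotype of each population through time.

    Each generation, every population mean moves towards its local
    optimum (selection), towards the average of all populations
    (gene flow), and by a random Gaussian step (drift).
    """
    z = [start] * len(optima)
    for _ in range(n_generations):
        zbar = sum(z) / len(z)
        z = [
            zi
            + alpha * (theta - zi)      # selection towards local optimum
            + m * (zbar - zi)           # gene flow homogenizes populations
            + rng.gauss(0.0, drift_sd)  # stochastic drift
            for zi, theta in zip(z, optima)
        ]
    return z

rng = random.Random(7)
optima = [0.0, 10.0]  # hypothetical local optima for two populations
no_flow = simulate_means(optima, 500, alpha=0.1, m=0.0, drift_sd=0.0, rng=rng)
with_flow = simulate_means(optima, 500, alpha=0.1, m=0.05, drift_sd=0.0, rng=rng)
with_drift = simulate_means(optima, 500, alpha=0.1, m=0.05, drift_sd=0.2, rng=rng)
print(no_flow, with_flow, with_drift)
```

Without gene flow each mean settles on its own optimum; with gene flow the two means are pulled towards each other and stop short of their optima.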
Phenotypic model is a poor fit compared to phylogenetic nearest neighbor.
P. cranolopha tends to have a longer style when co-occurring with a close relative.
P. cranolopha species complex is taxonomically challenging. Split into species/subspecies based on style length, pubescence, and presence of a "forked beak". But variation is relatively continuous.
Hybrid zones: contact between populations with "forked beak" and without.