2. Genealogy, gene trees and species trees
3. Trees as data
4. Phylogenetics research
5. Sequence data and phylogenomics
6. Likelihood and model-based inference
Are the characters descended from a common ancestor? Has the character state evolved multiple times? Do the evolutionary relationships of the genes controlling this trait match the evolutionary relationships of the species?
Discrete data: DNA is discrete; there are only four possible states (A, C, G, T). But what about body mass, height, color, etc.?
Character weighting: Is one trait more informative than another? How can biased sampling of characters be avoided?
Why not use only DNA? For the phylogenetics of extant taxa we mostly do. But morphology is still very relevant to the phylogeny of extinct taxa (e.g., dinosaurs) and to placing fossils with extant taxa.
Essential to all fields of biology.
Simple but commonly done incorrectly.
- describing taxa as being "basal".
- misinterpreting node rotations as meaningful.
- misinterpreting relationships.
- misinterpreting branch lengths.
- misinterpreting rooting of tree.
- failing to read figure captions when present.
Reading trees involves interpreting the order in which lineages share common ancestors by tracing relationships backwards from the tips towards the root. Rotating nodes does not affect these relationships, even though the order of the tips changes. Which topology is different?
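The point about node rotations can be checked with a small sketch. It assumes rooted trees written as nested Python tuples of tip names (a representation chosen here for illustration, with hypothetical taxa A to D) and compares trees by the set of clades they contain rather than by the left-to-right order of the tips.

```python
def clades(tree):
    """Return the set of clades (frozensets of tip names) in a nested-tuple tree."""
    found = set()

    def tips_below(node):
        if isinstance(node, str):          # a tip: its only descendant is itself
            return frozenset([node])
        # An internal node: its clade is the union of its children's tips.
        below = frozenset().union(*(tips_below(child) for child in node))
        found.add(below)
        return below

    tips_below(tree)
    return found


# Hypothetical four-taxon trees (assumed example, not from the notes).
original = (("A", "B"), ("C", "D"))
rotated = (("D", "C"), ("B", "A"))     # every node rotated, tips reordered
different = (("A", "C"), ("B", "D"))   # a genuinely different topology

print(clades(original) == clades(rotated))    # True
print(clades(original) == clades(different))  # False
```

Two rooted trees that contain exactly the same clades describe the same relationships, however the tips happen to be ordered on the page.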
Phylogenetic trees are more than just pictures; they represent a data structure that can be interpreted and used in model-based analyses. They are typically stored in Newick format.
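To make the idea of a tree as a data structure concrete, here is a minimal sketch of a Newick reader. It assumes a well-formed string containing only parentheses, commas, node names, and optional branch lengths; quoted labels and comments are not handled. The function names `parse_newick` and `tip_names` and the example string are illustrative assumptions, not part of any particular library.

```python
def parse_newick(text):
    """Parse a simple Newick string into nested dicts (names and branch lengths only)."""
    text = text.strip().rstrip(";")
    pos = 0

    def read_node():
        nonlocal pos
        children = []
        if text[pos] == "(":                 # internal node: read its children
            pos += 1                         # skip "("
            children.append(read_node())
            while text[pos] == ",":
                pos += 1                     # skip ","
                children.append(read_node())
            pos += 1                         # skip ")"
        start = pos                          # optional node name
        while pos < len(text) and text[pos] not in ",():":
            pos += 1
        name = text[start:pos]
        length = None
        if pos < len(text) and text[pos] == ":":   # optional branch length
            pos += 1
            start = pos
            while pos < len(text) and text[pos] not in ",()":
                pos += 1
            length = float(text[start:pos])
        return {"name": name, "length": length, "children": children}

    return read_node()


def tip_names(node):
    """List the tip names below a node, left to right."""
    if not node["children"]:
        return [node["name"]]
    return [name for child in node["children"] for name in tip_names(child)]


# Hypothetical three-taxon tree with branch lengths.
tree = parse_newick("((A:1.0,B:2.0):0.5,C:3.0);")
print(tip_names(tree))                 # ['A', 'B', 'C']
print(tree["children"][0]["length"])   # 0.5
```

In practice an established phylogenetics library would normally do the parsing; the sketch only shows that a Newick string maps directly onto a nested structure of nodes, children, and branch lengths.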
Methods of phylogenetic inference, and model-based historical inferences using trees, are both highly active areas of research. Many new methods are published in the journal Systematic Biology, while countless applied examples are published in various journals, including Evolution, Molecular Biology and Evolution, Molecular Phylogenetics and Evolution, Molecular Ecology, etc.
Collect/measure homologous characters for some number of taxa. For DNA, identifying homology typically involves targeting regions of the genome using primers, or mapping sequenced reads from the genome to the same region of a reference genome. Either way, it is based on sequence similarity. This is typically followed by a more rigorous multiple sequence alignment.
A general outline of phylogenetic inference:
1. Propose a starting tree (e.g., random or star).
2. Score based on some criterion (e.g., parsimony, likelihood, distance).
3. Modify the tree to propose a new one, then return to step 2.
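The three steps above can be sketched as a greedy search loop. Everything specific here is an assumption made for illustration: trees are nested tuples, the proposal simply swaps two tips (real software typically uses rearrangements such as NNI or SPR), the distances are invented, and `cherry_distance` is a toy stand-in for a real distance criterion.

```python
import random


def tips(tree):
    """List the tip names of a nested-tuple tree, left to right."""
    if isinstance(tree, str):
        return [tree]
    return [t for child in tree for t in tips(child)]


def swap_two_tips(tree, rng):
    """Toy proposal: exchange the positions of two randomly chosen tips."""
    a, b = rng.sample(tips(tree), 2)
    swap = {a: b, b: a}

    def relabel(node):
        if isinstance(node, str):
            return swap.get(node, node)
        return tuple(relabel(child) for child in node)

    return relabel(tree)


def cherry_distance(tree, dist):
    """Toy score: summed distance between sister tips (lower is better)."""
    if isinstance(tree, str):
        return 0.0
    left, right = tree
    if isinstance(left, str) and isinstance(right, str):
        return dist[frozenset((left, right))]
    return cherry_distance(left, dist) + cherry_distance(right, dist)


def search(start, score, steps=100, seed=1):
    rng = random.Random(seed)
    best, best_score = start, score(start)       # steps 1 and 2
    for _ in range(steps):
        candidate = swap_two_tips(best, rng)     # step 3: propose a new tree
        candidate_score = score(candidate)       # back to step 2: score it
        if candidate_score < best_score:         # keep only improvements
            best, best_score = candidate, candidate_score
    return best, best_score


# Hypothetical pairwise distances: A-B and C-D are the close pairs.
dist = {frozenset(pair): d for pair, d in [
    (("A", "B"), 1.0), (("C", "D"), 1.0),
    (("A", "C"), 4.0), (("A", "D"), 4.0),
    (("B", "C"), 4.0), (("B", "D"), 4.0),
]}
start = (("A", "C"), ("B", "D"))                 # a deliberately poor starting tree
best, best_score = search(start, lambda t: cherry_distance(t, dist))
print(best, best_score)
```

Because the loop accepts only improvements, it can stall on a local optimum for larger problems, and swapping tips never changes the shape of the tree, so this sketch explores far less of tree space than a real search would.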
Given a character matrix and a topology, count the number of character state changes.
This approach does not account for homoplasy (repeated mutations at the same site) or for other patterns of varying rates of character change.
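The counting step described above can be sketched with Fitch's algorithm, one common way to find the minimum number of changes for unordered, equally weighted characters on a bifurcating tree. The nested-tuple tree representation, the function names, and the four-taxon matrix are assumptions made for this example.

```python
def fitch(tree, states):
    """Return (possible ancestral states, minimum changes) for one character.

    tree   -- nested tuples of tip names, strictly bifurcating
    states -- dict mapping each tip name to its observed state
    """
    if isinstance(tree, str):                      # a tip: state is observed
        return {states[tree]}, 0
    left_set, left_changes = fitch(tree[0], states)
    right_set, right_changes = fitch(tree[1], states)
    shared = left_set & right_set
    if shared:                                     # children agree: no new change
        return shared, left_changes + right_changes
    # Children disagree: at least one change is needed at this node.
    return left_set | right_set, left_changes + right_changes + 1


def parsimony_score(tree, matrix):
    """Sum the minimum number of changes over all characters (columns)."""
    n_chars = len(next(iter(matrix.values())))
    return sum(
        fitch(tree, {tip: row[i] for tip, row in matrix.items()})[1]
        for i in range(n_chars)
    )


# Hypothetical matrix: four taxa (rows) by three characters (columns).
matrix = {"A": "AAG", "B": "AAG", "C": "CAT", "D": "CGT"}
tree_1 = (("A", "B"), ("C", "D"))
tree_2 = (("A", "C"), ("B", "D"))

print(parsimony_score(tree_1, matrix))  # 3
print(parsimony_score(tree_2, matrix))  # 5
```

Under the parsimony criterion `tree_1` is preferred because it needs fewer changes. On `tree_2` the first and third characters each require two changes, which is the kind of repeated change that a simple count registers but does not model.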