Bipartition methods¶
A phylogenetic tree can be split into two trees along any branch connecting two sections of the tree. This creates bipartitions
, and there can many possible bipartitions belonging to a single tree. Many algorithms compare tips (or internal Nodes) on either side of each split to compute metrics on trees. The iter_bipartitions()
function is available in the toytree.enum
subpackage, and aims to provide a flexible and fast framework for yielding bipartitions in various formats. These bipartitions are generated in Node idx traversal order, and are returned as iterable tuples each containing two items.
The items within these tuples can be specified to be represented by a particular feature (e.g., set, list, or tuple) using the argument type
The order of the tuples returned can also be ordered using the argument sort
Quick example¶
Here a simple toytree is generated from a newick string, and then split into possible bipartitions. The iter_bipartitions()
method generates multiple bipartitions which you can iterate through as you do a list or tuple.
import toytree
#generate tree
tree = toytree.tree("(a,b,((c,d)CD,(e,f)EF)X)AB;")
print(tree.get_tip_labels())
tree.draw();
['a', 'b', 'c', 'd', 'e', 'f']
#iterate through bipartitions
for bipartition in tree.iter_bipartitions():
print(bipartition)
({'d', 'c'}, {'b', 'f', 'e', 'a'}) ({'f', 'e'}, {'b', 'd', 'a', 'c'}) ({'f', 'e', 'c', 'd'}, {'b', 'a'})
By default, singleton splits (e.g., (A | B,C,D)) are not included because it is implicit that one exists for each tip of the tree. However, if these are important for an iteration then they can be included using the argument include_singleton_partitions=true
#include singleton splits
for bipartition in tree.iter_bipartitions(include_singleton_partitions=True):
print(bipartition)
({'a'}, {'b', 'e', 'f', 'd', 'c'}) ({'b'}, {'e', 'a', 'f', 'd', 'c'}) ({'c'}, {'b', 'e', 'a', 'f', 'd'}) ({'d'}, {'b', 'e', 'a', 'f', 'c'}) ({'e'}, {'b', 'a', 'f', 'd', 'c'}) ({'f'}, {'b', 'e', 'a', 'd', 'c'}) ({'d', 'c'}, {'b', 'f', 'e', 'a'}) ({'f', 'e'}, {'b', 'd', 'a', 'c'}) ({'f', 'e', 'c', 'd'}, {'b', 'a'})
You can also choose to include the internal nodes in each bipartition using include_internal_nodes=True
#include singleton splits, internal nodes
for bipartition in tree.iter_bipartitions(sort=True, include_internal_nodes=True, include_singleton_partitions=True):
print(bipartition)
({'a'}, {'b', 'e', 'f', 'CD', 'EF', 'X', 'd', 'AB', 'c'}) ({'b'}, {'e', 'a', 'f', 'CD', 'EF', 'X', 'd', 'AB', 'c'}) ({'c'}, {'b', 'e', 'a', 'f', 'CD', 'EF', 'X', 'd', 'AB'}) ({'d'}, {'b', 'e', 'a', 'f', 'CD', 'EF', 'X', 'AB', 'c'}) ({'e'}, {'b', 'a', 'f', 'CD', 'EF', 'X', 'd', 'AB', 'c'}) ({'f'}, {'b', 'e', 'a', 'CD', 'EF', 'X', 'd', 'AB', 'c'}) ({'d', 'CD', 'c'}, {'b', 'e', 'a', 'f', 'EF', 'X', 'AB'}) ({'f', 'e', 'EF'}, {'b', 'a', 'CD', 'X', 'd', 'AB', 'c'}) ({'b', 'a', 'AB'}, {'e', 'f', 'CD', 'EF', 'X', 'd', 'c'})
Returning different features¶
The iter_bipartitions()
method allows you to retreive the bipartitions in the form of any feature belonging to the toytree
object using the feature
argument. The function defaults to returning the name of the node, but you can pass in other features such as idx
. You can also pass in feature=None
to return the Node
object itself.
tree.features
('idx', 'name', 'height', 'dist', 'support')
#default: name
for name in tree.iter_bipartitions():
print(name)
#index
for idx in tree.iter_bipartitions(feature='idx'):
print(idx)
#height
for height in tree.iter_bipartitions(feature='height'):
print(height)
#Node object feature=None
for height in tree.iter_bipartitions(feature=None):
print(height)
({'D', 'C'}, {'B', 'A'}) ({2, 3}, {0, 1}) ({0.0}, {1.0, 2.0}) ({<Node(idx=2, name='C')>, <Node(idx=3, name='D')>}, {<Node(idx=1, name='B')>, <Node(idx=0, name='A')>})
Note: Returning different features does not change how sorting is done within and between bipartitions, which will be described next.
Sorting¶
By default, bipartitions are returned as (child, parent) order given the topology and rooting in Node idx order traversal. However, if sort=True
, then ipartitions are instead always sorted first by length, e.g., (fewer, longer) and if the same length, then next by the lowest alphanumeric tip name, e.g., ({'a', 'b'}, {'c', 'd'}).
Also, if the requested partition type
is sortable (i.e., not a set), then items within a partition are also consistently sorted regardless of the sort
argument.
#generate tree
tree = toytree.tree("(A,B,(C,D,E));")
print(tree.get_tip_labels())
#generate list of bipartitions
unsorted = list(tree.iter_bipartitions())
print(f"unsorted: {unsorted}")
#generate sorted list of bipartitions
sorted = list(tree.iter_bipartitions(sort=True))
print(f"sorted: {sorted}")
['A', 'B', 'C', 'D', 'E'] unsorted: [({'D', 'C', 'E'}, {'B', 'A'})] sorted: [({'B', 'A'}, {'D', 'C', 'E'})]
#generate tree
tree = toytree.tree("(A,(B,(C,D)));")
print(tree.get_tip_labels())
#generate list of bipartitions
unsorted = list(tree.iter_bipartitions())
print(f"unsorted: {unsorted}")
#generate sorted list of bipartitions
sorted = list(tree.iter_bipartitions(sort=True))
print(f"sorted: {sorted}")
['A', 'B', 'C', 'D'] unsorted: [({'D', 'C'}, {'B', 'A'})] sorted: [({'B', 'A'}, {'D', 'C'})]
Here, the bipartitions are the same size, so they are placed in alphabetical order. Next, you can see that if the type is sortable (e.g. type=list
, type=tuple
), then the tips in each bipartition are also sorted. This happens regardless of the sort
argument.
#generate tree
tree = toytree.tree("(A,(B,(C,D)));")
print(tree.get_tip_labels())
#generate list of bipartitions
default = list(tree.iter_bipartitions())
print(f"unsorted: {default}")
#generate sorted list of bipartitions
sortable = list(tree.iter_bipartitions(type=list))
print(f"sorted: {sortable}")
['A', 'B', 'C', 'D'] unsorted: [({'D', 'C'}, {'B', 'A'})] sorted: [(['C', 'D'], ['A', 'B'])]