toytree.Node

The toytree.Node class is primarily used for data storage. Minimally, it contains attributes storing a .name, .dist (edge length), and .support values, as well as attributes .up and .children which point to other Node objects to represent connections between them.

A single Node instance is generally of little use; it is only when nodes form connections that they gain emergent properties in the form of a network or tree structure. Thus, most methods in the toytree library are associated with ToyTree objects, which are containers around a collection of Nodes. However, Node objects are important to understand as the underlying objects storing data within trees. This section describes the structure of Node objects and the design behind their intended use.

In [1]:

import toytree

In [2]:

# create an example tree
tree = toytree.rtree.rtree(ntips=8, seed=321)
tree.draw('c');

The Node class

The Node class is accessible from toytree.Node and can be used to create new instances or to check or validate the type of a Node instance. Unless you are a developer, you are unlikely to create new Node objects often; instead, you will most often interact with them by selecting them from within ToyTrees.

In [3]:

# create a new Node
single_node = toytree.Node(name="single")
single_node

Out[3]:

<Node(name='single')>

In [4]:

# select a Node from a ToyTree
node3 = tree[3]
node3

Out[4]:

<Node(idx=3, name='r3')>

In [5]:

# check that an object's type is a Node
isinstance(node3, toytree.Node)

Out[5]:

True

Attributes

name: str

The default name attribute is an empty string. Leaf nodes usually have names associated with them whereas internal nodes usually do not. This will depend on the data that a tree is parsed or constructed from, and whether additional names are added. Some characters are not allowed in node names ([:;(),\[\]\t\n\r=]) as they would interfere with Newick string parsing when written to a file. Names can be accessed from a Node's .name attribute, and can be used to query nodes from a ToyTree.

In [6]:

# a name can be accessed from a Node
single_node.name

Out[6]:

'single'

In [7]:

# a name can be accessed from a Node in a ToyTree
tree[3].name

Out[7]:

'r3'

In [8]:

# returns .name from Nodes in the order they will be plotted (idxorder)
tree.get_tip_labels()

Out[8]:

['r0', 'r1', 'r2', 'r3', 'r4', 'r5', 'r6', 'r7']
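
The restriction on characters in names can be checked before assigning a name. The following is a minimal sketch that reuses the character class quoted above; the helper name `is_valid_name` is hypothetical and is not part of toytree.

```python
import re

# Character class copied from the text above. The helper below is a
# hypothetical illustration, not a toytree function.
FORBIDDEN = re.compile(r"[:;(),\[\]\t\n\r=]")


def is_valid_name(name: str) -> bool:
    """Return True if the name contains no Newick-reserved character."""
    return FORBIDDEN.search(name) is None


print(is_valid_name("r3"))   # True
print(is_valid_name("a:b"))  # False, ':' separates a name from an edge length
```

A check like this is only a convenience; how toytree itself reports an invalid name is not shown here.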

idx: int

The default idx attribute is an int value of -1, which means that the node is not part of a ToyTree. If a node is in a ToyTree then it will be assigned a unique idx integer between 0 and nnodes-1. The leaf nodes in a tree have idx values between 0 and ntips - 1, and all internal nodes are labeled by increasing numbers in a post-order left-then-right traversal. This is termed an idxorder traversal. When a tree structure changes (e.g. during re-rooting) the idx values of nodes are updated and can change (see Traversal). A node's idx value can be checked from its .idx attribute, or if it is in a ToyTree then by calling .get_node_data() or plotting the tree to visualize idx values.
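
The numbering rule described above can be sketched in plain Python. This is an illustration under the stated rule (tips first, then internal nodes in post-order), using a hypothetical `MiniNode` stand-in class; it is not toytree's implementation.

```python
class MiniNode:
    """Hypothetical stand-in for a tree node (not toytree.Node)."""

    def __init__(self, name="", children=()):
        self.name = name
        self.children = tuple(children)
        self.up = None
        self.idx = -1  # -1 means "not yet part of a tree"
        for child in self.children:
            child.up = self


def postorder(node):
    """Yield nodes left-then-right, children before their parent."""
    for child in node.children:
        yield from postorder(child)
    yield node


def assign_idx(root):
    """Number tips 0..ntips-1, then internal nodes in post-order."""
    nodes = list(postorder(root))
    tips = [n for n in nodes if not n.children]
    internals = [n for n in nodes if n.children]
    ordered = tips + internals
    for idx, node in enumerate(ordered):
        node.idx = idx
    return ordered


# build the small tree ((a,b)ab,c)root
a, b, c = MiniNode("a"), MiniNode("b"), MiniNode("c")
ab = MiniNode("ab", (a, b))
root = MiniNode("root", (ab, c))

order = assign_idx(root)
print([(n.name, n.idx) for n in order])
# [('a', 0), ('b', 1), ('c', 2), ('ab', 3), ('root', 4)]
```

The root always receives the highest idx under this rule, which agrees with the example tree in this section, where the root is Node 14.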

In [9]:

# a Node that is not part of a ToyTree has idx=-1
single_node.idx

Out[9]:

-1

In [10]:

# Nodes in a ToyTree have unique idx values between 0 and nnodes - 1
node3.idx

Out[10]:

3

dist: float

The default dist attribute is a float of 0. The value represents the distance from a node to its parent. In other words, it is the length of the edge connecting them. The dist attribute is thus not actually a feature of a node but of an edge between nodes; it is nevertheless stored on the Node object. We call this an edge_feature of a Node, since it will change if the tree is re-rooted in a way that changes which Node is the parent of another. The value of a dist can range from very small to very large, such as when representing the expected number of substitutions per site on a phylogeny, or divergence times in millions of years.

In [11]:

# default Node dist
single_node.dist

Out[11]:

0.0

In [12]:

# the dist from node 3 to its parent
node3.dist

Out[12]:

1.0

support: float

The default support value is numpy.nan, which represents the absence of support information. Tip (leaf) nodes are not expected to have support information, since they do not represent a split in a tree. Similarly, the root node support is nan since it does not represent a true split.

In [13]:

# the default support value
single_node.support

Out[13]:

nan
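
Because nan never compares equal to anything, including itself, an equality test cannot detect a missing support value. The short sketch below uses a plain Python float nan as a stand-in for the default support value (an assumption made for illustration; numpy.nan is the same kind of float value).

```python
import math

# stand-in for a node's default support value
support = float("nan")

# equality comparison with nan is always False, even against itself
print(support == support)   # False

# use math.isnan (or numpy.isnan) to test for missing support
print(math.isnan(support))  # True
```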

up: Node

The .up attribute references a node's parent. The default value is None. This is also the value of the .up attribute of the root Node in a ToyTree, since it has no parent. A Node can only have one parent. If a tree is re-rooted the relationship between nodes can change such that a Node that was previously a child can become a parent, and thus the Node attributes are automatically updated during this process.

In [14]:

# the default .up is None (no value is returned here)
single_node.up

In [15]:

# node3's parent is Node 10
node3.up

Out[15]:

<Node(idx=10)>

In [16]:

# the parent of node3's parent is Node 13
node3.up.up

Out[16]:

<Node(idx=13)>

children: tuple

The .children attribute is a tuple of zero or more Node objects that are descended from a node. The default is an empty tuple. If a tree is re-rooted the relationship between nodes can change such that a Node that was previously a child can become a parent, and thus the Node attributes are automatically updated during this process.

In [17]:

# this single Node has no children
single_node.children

Out[17]:

()

In [18]:

# internal Node 8 in the tree has two children
tree[8].children

Out[18]:

(<Node(idx=0, name='r0')>, <Node(idx=1, name='r1')>)

height: float

The default height value is a float of 0. The height of a Node is an emergent property of a tree of connected nodes: it is the node's height above the tip that is farthest from the root. This value is automatically updated for every node in a ToyTree during the cached traversal that runs whenever the tree is modified.

In [19]:

# a single unconnected Node has the default height of 0
single_node.height

Out[19]:

0.0

In [20]:

# leaf node 3 height
node3.height

Out[20]:

1.0

In [21]:

# internal node 8 height
tree[8].height

Out[21]:

2.0
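
To make the relationship between .dist and .height concrete, here is a minimal sketch that derives heights from edge lengths. It assumes, as described above, that height is measured upward from the tip farthest from the root. The `MiniNode` class and `node_heights` function are hypothetical and are not toytree code.

```python
class MiniNode:
    """Hypothetical stand-in for a tree node (not toytree.Node)."""

    def __init__(self, name, dist=0.0, children=()):
        self.name = name
        self.dist = float(dist)  # length of the edge to the parent
        self.children = tuple(children)


def node_heights(root):
    """Return {name: height}, with the farthest tip at height 0."""
    depths = {}
    stack = [(root, 0.0)]  # the root's own dist is ignored
    while stack:
        node, depth = stack.pop()
        depths[node.name] = depth
        for child in node.children:
            stack.append((child, depth + child.dist))
    deepest = max(depths.values())
    return {name: deepest - depth for name, depth in depths.items()}


# the tree ((a:3,b:2)ab:1,c:1)root
ab = MiniNode("ab", 1, (MiniNode("a", 3), MiniNode("b", 2)))
root = MiniNode("root", 0, (ab, MiniNode("c", 1)))

heights = node_heights(root)
print(heights["a"], heights["b"], heights["ab"], heights["root"])
# 0.0 1.0 3.0 4.0
```

Note that tips b and c have nonzero heights here because the tree is not ultrametric, which is also why a leaf node can report a height greater than 0.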

Methods

The Node object provides a number of functions for fetching information about a node's position relative to other connected nodes. Some of this information is also accessible from a ToyTree object, but it is sometimes easier to access it from a Node object directly.

In [22]:

node3.is_leaf()

Out[22]:

True

In [23]:

node3.is_root()

Out[23]:

False

In [24]:

node3.get_ancestors()

Out[24]:

(<Node(idx=10)>, <Node(idx=13)>, <Node(idx=14)>)

In [25]:

node3.get_descendants()

Out[25]:

(<Node(idx=3, name='r3')>,)

In [26]:

node3.get_leaves()

Out[26]:

[<Node(idx=3, name='r3')>]

In [27]:

node3.get_sisters()

Out[27]:

(<Node(idx=4, name='r4')>,)

In [28]:

node3.get_leaf_names()

Out[28]:

['r3']

Each of the get_[x] functions above is also available as a generator function named iter_[x], which is more efficient for fetching such data over very large trees, or for terminating a traversal over part of the tree once a condition has been met. The traverse() function is also a generator function.

In [29]:

node3.iter_ancestors()

Out[29]:

<generator object Node.iter_ancestors at 0x7f3ba6ca9bd0>

In [30]:

node3.traverse("idxorder")

Out[30]:

<generator object Node._traverse_idxorder at 0x7f3ba6ca9d20>
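
The difference between the get_ and iter_ forms can be illustrated with a small stand-in class. This sketch only mirrors the pattern described above (a generator plus a wrapper that collects its results); `MiniNode` is hypothetical and is not toytree's implementation.

```python
class MiniNode:
    """Hypothetical stand-in for a tree node (not toytree.Node)."""

    def __init__(self, name, up=None):
        self.name = name
        self.up = up

    def iter_ancestors(self):
        """Lazily yield the parent, grandparent, ... up to the root."""
        node = self.up
        while node is not None:
            yield node
            node = node.up

    def get_ancestors(self):
        """Eagerly collect all ancestors into a tuple."""
        return tuple(self.iter_ancestors())


root = MiniNode("root")
mid = MiniNode("mid", up=root)
tip = MiniNode("tip", up=mid)

# the generator stops as soon as the condition is met; 'root' is never visited
first_match = next(n for n in tip.iter_ancestors() if n.name.startswith("m"))
print(first_match.name)                        # mid
print([n.name for n in tip.get_ancestors()])   # ['mid', 'root']
```

On a small tree the two forms are interchangeable; the lazy form matters mainly when a search can stop early on a very large tree.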

Nodes vs 'Edges'

Notably, toytree does not implement a separate "Edge" class to represent edges in a tree. Instead, edges are simply represented by the connections between Node objects -- by their .up and .children attributes. (This can be important when storing new data types to a tree; see Edge features). Thus you can think of edges as pairs of nodes. You can fetch the edge information from a ToyTree in a variety of ways. Below we use the function get_edges which has options for returning this information in a number of tabular formats.

In [31]:

# edges are simply pairs of Nodes with a child,parent relationship
tree.get_edges(feature='idx', df=True)

Out[31]:

	child	parent
0	0	8
1	1	8
2	2	9
3	3	10
4	4	10
5	5	11
6	6	11
7	7	12
8	8	9
9	9	14
10	10	13
11	11	12
12	12	13
13	13	14
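
Since edges are just (child, parent) pairs, the table above is enough to recover the connections between nodes. The following plain-Python sketch copies the idx pairs from Out[31] and rebuilds the parent and children lookups; it does not use toytree, and the variable and function names are illustrative only.

```python
from collections import defaultdict

# (child, parent) idx pairs copied from the edge table above
edges = [(0, 8), (1, 8), (2, 9), (3, 10), (4, 10), (5, 11), (6, 11),
         (7, 12), (8, 9), (9, 14), (10, 13), (11, 12), (12, 13), (13, 14)]

# each child has exactly one parent, so a dict is sufficient for 'up'
parent = dict(edges)

# a parent can have several children
children = defaultdict(list)
for child, par in edges:
    children[par].append(child)

# the root is the only node that appears as a parent but never as a child
root = (set(parent.values()) - set(parent)).pop()


def ancestors(idx):
    """Follow parent links from a node up to the root."""
    path = []
    while idx in parent:
        idx = parent[idx]
        path.append(idx)
    return path


print(root)           # 14
print(children[8])    # [0, 1]
print(ancestors(3))   # [10, 13, 14]
```

The results agree with the outputs shown earlier for tree[8].children and node3.get_ancestors().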

Mutability of Nodes

The data assigned to nodes may represent a feature of the node itself, or it may represent a feature of the edge connecting that node to its parent. In the latter case, it is important that the data be treated appropriately if the tree is modified, such as when a node is pruned from the tree, or the tree is re-rooted. In these cases, the edge features, such as the .dist, .support, and the connection information .up and .children, need to be automatically updated. Similarly, emergent properties of nodes in a tree, such as the .height of a node relative to the farthest leaf must be re-computed.

The automatic updating of these attributes is done at the level of a ToyTree, not within individual Nodes, and thus we have intentionally designed these elements of Node objects to be immutable (you cannot modify them directly). Users cannot call node.idx = 3 or node.height = 100 to set these attributes to a new value, since these attributes are properties of the node's placement with respect to other nodes in the tree, which also need to be updated. If you try to set one of these values, a ToytreeError exception (toytree.utils.ToytreeError) will be raised, as in the example below, where we catch the exception and print it. For developers there is a simple workaround for this, described further below.

In [32]:

# catch 'ToyTreeError' exception raised when trying to modify a Node attribute
try:
    single_node.idx = 10
except toytree.utils.ToytreeError as exc:
    print("ToyTreeError:", exc)

ToyTreeError: Cannot set .idx attribute of a Node. If you are an advanced user then you can do so by setting ._idx. See the docs section on Modifying Nodes and Tree Topology.
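
One common way to get this behavior in Python is a property whose setter raises, backed by a private attribute. The sketch below shows that general pattern only; `MiniNode` and `MiniTreeError` are hypothetical names, and this is not a claim about how toytree implements it internally.

```python
class MiniTreeError(Exception):
    """Hypothetical exception type used only in this sketch."""


class MiniNode:
    """Hypothetical stand-in for a tree node (not toytree.Node)."""

    def __init__(self, name=""):
        self.name = name
        self._idx = -1  # private storage for the protected value

    @property
    def idx(self):
        """Public read access to the protected value."""
        return self._idx

    @idx.setter
    def idx(self, value):
        # refuse public writes so that tree-level code stays in control
        raise MiniTreeError("Cannot set .idx directly; set ._idx instead.")


node = MiniNode("single")
try:
    node.idx = 10
except MiniTreeError as exc:
    print("blocked:", exc)

# the private attribute can still be written deliberately
node._idx = 10
print(node.idx)  # 10
```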

Calling mod functions

Instead of modifying a node's attributes directly, you should call one of the tree modification functions from the toytree.mod subpackage, which ensure that the rest of the tree data is automatically updated along with the modified node data. Examples include .root, .drop_tips, .prune, .ladderize, .rotate_nodes, .edges_set_node_heights, and many others that modify one or more .up, .children, .idx, .dist, or .height attributes of nodes in unison.

In [33]:

# an example toytree.mod function that modifies node attributes
rtree = tree.mod.root("r4")

# the new tree has different idx values b/c the traversal order changed
toytree.mtree([tree, rtree]).draw(ts='p');

Developing mod functions

Sometimes, however, you may really want to directly modify one or more core features of a Node. This is possible, but you should be well aware of the necessary considerations to avoid errors in your code. You can examine the source code of the many .mod subpackage functions above for examples. Each of these core attributes is available as a private attribute (e.g., ._dist, ._idx) which can be modified without raising an exception. The key, however, is that after one or more private node attributes have been modified, the ToyTree traversal caching function named ._update() must be called at the end to ensure that all of the linked attributes of nodes are updated.

In [34]:

# create a new tree copy
modtree = tree.copy()

# modify one or more private node attributes
modtree[0]._dist += 2
modtree[1]._dist += 3

# call update to update idxs, heights, etc.
modtree._update()

# show the old and new tree with longer .dists for nodes 0,1 and .heights for all nodes
toytree.mtree([tree, modtree]).draw(ts='p', scale_bar=True);

Building trees from Nodes

There are several ways of constructing trees in toytree from scratch. The simplest is to use one of the random tree generation functions from the toytree.rtree subpackage. A second method is to write a Newick string and parse it using the toytree.tree function. A third is to build or modify a tree using one or more functions from toytree.mod, such as .add_child_node. Finally, the fourth method is to link together Node objects manually. This is the most low-level method, and it requires eventually calling ToyTree._update() to cache the traversal order and store idx values. Each of these is demonstrated below.

Generate random or fixed trees. See the rtree documentation section for more details. This includes options to generate trees under a variety of algorithms and of different sizes.

In [35]:

# generate a 6-tip balanced tree with crown height of 1M units
toytree.rtree.baltree(6, treeheight=1e6).draw(scale_bar=True);

Parse a Newick string to generate a tree from scratch with desired characteristics.

In [36]:

# generate a ToyTree with this specific data
toytree.tree("(((a:3,b:2):1),(c:3,d:2):5);").draw(scale_bar=True);

Modify a tree using one or more toytree.mod functions:

In [37]:

# get a 4-tip balanced tree
tree4 = toytree.rtree.baltree(4)

# add a new sister (internal and tip node) to tip node 'r1'
modtree4 = toytree.mod.add_internal_node_and_child(tree4, 'r1', name="child", parent_name="parent")

# draw to highlight new parent and child nodes
modtree4.draw('r', node_mask=modtree4.get_node_mask(5), node_colors="lightgrey");

Create connections among Node objects and create a ToyTree from them. You can do this by setting ._up, ._children, and ._dist values on a set of nodes.

In [38]:

# create several tip nodes
nodeA = toytree.Node("A", dist=1)
nodeB = toytree.Node("B", dist=1)
nodeC = toytree.Node("C", dist=1)

# create several internal Nodes
nodeAB = toytree.Node("AB", dist=1)
nodeABC = toytree.Node("ABC", dist=1)

# connect the nodes
nodeA._up = nodeAB
nodeB._up = nodeAB
nodeC._up = nodeABC
nodeAB._up = nodeABC
nodeAB._children = (nodeA, nodeB)
nodeABC._children = (nodeAB, nodeC)

# draw the tree (the tree traversal data is cached at this step)
toytree.tree(nodeABC).draw(ts='r', node_colors="lightgrey");

Similarly, this process could be applied to an existing tree to add or remove connections by changing the same types of node attributes. The important thing is that the ToyTree._update() function is called at the end to update values across connected nodes. The Node object includes convenience functions _add_child and _remove_child which change the ._up and ._children attributes together, but setting them manually may be more clear.
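
The reason paired helpers are convenient is that a parent link and a child link describe the same edge and must always agree. Here is a minimal sketch of that bookkeeping on a hypothetical `MiniNode` class; the method names and bodies are illustrative assumptions and are not the toytree source.

```python
class MiniNode:
    """Hypothetical stand-in for a tree node (not toytree.Node)."""

    def __init__(self, name):
        self.name = name
        self.up = None
        self.children = ()

    def add_child(self, child):
        """Set both directions of the link in one step."""
        child.up = self
        self.children = self.children + (child,)

    def remove_child(self, child):
        """Clear both directions of the link in one step."""
        self.children = tuple(c for c in self.children if c is not child)
        child.up = None


parent = MiniNode("r0")
child0 = MiniNode("child0")
child1 = MiniNode("child1")
parent.add_child(child0)
parent.add_child(child1)
print([c.name for c in parent.children])  # ['child0', 'child1']

parent.remove_child(child0)
print([c.name for c in parent.children])  # ['child1']
print(child0.up)                          # None
```

In this sketch nothing else (such as idx or height values) is recomputed, which mirrors why a tree-level update step is still needed after links are changed.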

In [39]:

# get a 4-tip balanced tree
tree4 = toytree.rtree.baltree(4, treeheight=2)

# add two new child (tip) nodes to tip node 0
tree4[0]._add_child(toytree.Node("child0", dist=1))
tree4[0]._add_child(toytree.Node("child1", dist=1))

# update the cached node data across the tree
tree4._update()

# draw to highlight new nodes. Note former node (idx=0, name='r0') is now node idx=5
tree4.draw('r', node_mask=tree4.get_node_mask(5), node_colors="lightgrey");
