Range-mapping

Range mapping is a convenient syntax for projecting raw data values, which may be very small or very large, into a range better suited to visualization, such as sizes in pixel units. It is especially useful for showing a range of data values on a tree as a distribution of node sizes or edge widths. (See Color-Mapping for a similar approach that projects discrete or continuous data values onto a color map.)

In [1]:

import toytree
import numpy as np

Take Home

Range mapping allows you to easily project data values into an appropriate range of pixel sizes to enter as arguments to drawing functions. This is done by entering the data feature to be range-mapped as a tuple in the format: (feature, min_value, max_value, nan_value).

In [2]:

# example: set node sizes from node idx values, mapped into the range 12-25 px
tree = toytree.rtree.bdtree(10, seed=123)
tree.draw(ts='s', node_sizes=("idx", 12, 25));

Example Data

To demonstrate range-mapping we will use a 10-tip birth-death tree with three data features assigned to the Nodes of the tree. The first feature, "X", contains small float values drawn randomly from the interval 0-1. The second feature, "Y", contains large float values drawn randomly from the interval 100-1000. The third feature, "Z", contains the same data as "X" but with missing values for many nodes.

In [3]:

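# simulate a 10-tip birth-death tree and create a seeded random number generator
tree = toytree.rtree.bdtree(10, seed=123)
rng = np.random.default_rng(seed=123)
# "X": small floats drawn uniformly between 0 and 1
tree.set_node_data("X", rng.uniform(0, 1, tree.nnodes), inplace=True);
# "Y": large floats drawn uniformly between 100 and 1000
tree.set_node_data("Y", rng.uniform(1e2, 1e3, tree.nnodes), inplace=True);
# "Z": copy of "X" for nodes 12 and above only; all other nodes are left missing
tree.set_node_data("Z", {i: i.X for i in tree[12:]}, inplace=True);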

In [4]:

tree.get_node_data(["X", "Y", "Z"])

Out[4]:

	X	Y	Z
0	0.682352	566.348533	NaN
1	0.053821	308.400062	NaN
2	0.220360	249.313594	NaN
3	0.184372	548.010072	NaN
4	0.175906	624.452177	NaN
5	0.812095	265.904189	NaN
6	0.923345	113.405425	NaN
7	0.276574	524.019906	NaN
8	0.819755	755.418995	NaN
9	0.889893	926.740443	NaN
10	0.512970	662.980605	NaN
11	0.244965	925.410315	NaN
12	0.824242	878.221226	0.824242
13	0.213763	296.328586	0.213763
14	0.741467	879.514688	0.741467
15	0.629940	757.676743	0.629940
16	0.927407	350.078761	0.927407
17	0.231908	817.339198	0.231908
18	0.799125	878.699542	0.799125

Visualizing Raw Data

As the examples below show, the raw values of features "X", "Y" and "Z" do not work well as arguments to the draw function for setting the size of node markers. The markers are either too small to see or so large that they obscure the entire plot. One solution would be to call get_node_data to extract the feature values and then multiply or divide them by a constant to bring them into a more reasonable pixel size range. While this works, range-mapping provides a more convenient solution, explained below.
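The manual approach mentioned above can be sketched as follows. This is only an illustration: the plain lists stand in for values that would be extracted with get_node_data (a few entries copied from the Out[4] table), and the constants 20 and 50 are arbitrary choices picked by eye, not values taken from toytree.

```python
# A few "X" and "Y" values copied from the Out[4] table above,
# standing in for data extracted with tree.get_node_data(...).
x = [0.682352, 0.053821, 0.220360, 0.927407]
y = [566.348533, 308.400062, 113.405425, 926.740443]

# Hand-picked constants (assumptions): multiply small values, divide large ones.
x_sizes = [v * 20 for v in x]
y_sizes = [v / 50 for v in y]

print([round(s, 2) for s in x_sizes])
print([round(s, 2) for s in y_sizes])
```

Each feature needs its own constant, and a single constant fixes neither the smallest nor the largest resulting size, which is the inconvenience that range mapping removes.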

In [5]:

# raw "X" data values are too small to use for node_sizes
tree.draw(node_sizes="X");

In [6]:

# raw "Y" data values are too large to use for node_sizes
tree.draw(node_sizes="Y", height=350, width=350);

Using Range Mapping

Range mapping allows you to project a set of values into a new range while preserving the relative differences among values. For example, the data in feature "X" ranges from about 0.05 to 0.93, values that are all too small to use directly as pixel sizes. Using range mapping we can project these values so that the smallest value maps to 5, the largest value maps to 15, and every intermediate value lands at the corresponding relative position between these min and max values.
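Assuming the projection is a plain linear (min-max) rescaling, which is consistent with the values printed in Out[12] at the end of this page, a data value $x$ is mapped as shown below. The symbols $x_{\min}$ and $x_{\max}$ (the smallest and largest observed values of the feature) are notation introduced here for illustration.

```latex
y = \mathrm{min\_value} + \frac{x - x_{\min}}{x_{\max} - x_{\min}} \,\bigl(\mathrm{max\_value} - \mathrm{min\_value}\bigr)
```

For instance, node 10 has X = 0.512970; with $x_{\min} = 0.053821$, $x_{\max} = 0.927407$, and a target range of 5 to 15, this gives $y \approx 5 + 0.5256 \times 10 \approx 10.26$.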

Tuple syntax

Range mapping is requested with the tuple syntax (feature name, min_value, max_value, nan_value). Only the feature name is required; if the other entries are omitted, the min, max, and nan values default to (5, 20, 0).
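The way omitted entries fall back to the defaults can be sketched with a small helper. Note that parse_range_tuple is a hypothetical function written for this illustration and is not part of toytree; only the default values (5, 20, 0) come from the text above.

```python
def parse_range_tuple(arg, defaults=(5, 20, 0)):
    """Hypothetical helper: fill missing (min, max, nan) entries with defaults."""
    feature, *rest = arg
    # keep the entries the user supplied, then append the remaining defaults
    filled = list(rest) + list(defaults[len(rest):])
    min_value, max_value, nan_value = filled
    return feature, min_value, max_value, nan_value


print(parse_range_tuple(("X",)))
print(parse_range_tuple(("X", 5, 15)))
print(parse_range_tuple(("Z", 5, 15, 10)))
```

Under this sketch, ("X",) and ("X", 5, 20, 0) describe the same mapping, and a three-element tuple only falls back to the default for nan_value.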

In [7]:

# project "X" values to pixel range using auto args for min,max,nan
tree.draw(node_sizes=("X",));

In [8]:

# project very small "X" values to pixel range 5-15
tree.draw(node_sizes=("X", 5, 15));

In [9]:

# project very large "Y" values to pixel range 5-15
tree.draw(node_sizes=("Y", 5, 15));

In [10]:

# project very small "Z" values to pixel range 5-15, with NaN values set to 0
tree.draw(node_sizes=("Z", 5, 15, 0));

Missing Data Values

When one or more Nodes lack a feature, they are assigned the value np.nan by default. Range-mapping can be told how to treat these values: by default they are converted to 0, but you can set any other value you want as the fourth element of the tuple.
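The treatment of missing values described above can be sketched in plain Python. This is a minimal sketch and not toytree's implementation: the function name range_map is made up for illustration, and it assumes that NaN entries are ignored when finding the data minimum and maximum and are replaced by nan_value afterwards.

```python
import math


def range_map(values, min_value=5, max_value=20, nan_value=0):
    """Linearly rescale non-missing values; replace NaN with nan_value."""
    # assumption: missing entries do not take part in the min/max calculation
    observed = [v for v in values if not math.isnan(v)]
    lo, hi = min(observed), max(observed)
    span = hi - lo
    mapped = []
    for v in values:
        if math.isnan(v):
            mapped.append(nan_value)
        elif span == 0:
            # all observed values identical: use the lower bound (a choice made here)
            mapped.append(min_value)
        else:
            mapped.append(min_value + (v - lo) / span * (max_value - min_value))
    return mapped


# two missing entries followed by three "Z" values from the Out[4] table
z = [math.nan, math.nan, 0.824242, 0.213763, 0.927407]
sizes = range_map(z, 5, 15, nan_value=10)
print(sizes)
```

With nan_value=10 the missing entries are drawn at a mid-range size, while the observed values still span the full 5 to 15 range.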

In [11]:

tree.draw(node_sizes=("Z", 5, 15, 10));

Discrete Data Values

Note that range-mapping can only be applied to continuous data, not discrete/categorical data. A more appropriate way to treat discrete data may be to use Color-Mapping instead.

Get Range Mapped Data

When you use the tuple format to request range mapping on a data feature, toytree performs a simple operation under the hood to project the data into its new value range. The same function is available as toytree.style.get_range_mapped_feature for users who wish to call it directly.

In [12]:

toytree.style.get_range_mapped_feature(tree, "X", min_value=2, max_value=15, nan_value=0)

Out[12]:

array([11.35328489,  2.        ,  4.47829579,  3.94275072,  3.81676794,
       13.2840094 , 14.93954874,  5.31483464, 13.39800011, 14.44173875,
        8.83268852,  4.84444335, 13.46477251,  4.38012595, 12.23298906,
       10.57333721, 15.        ,  4.65014844, 13.09100966])
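As a cross-check, the output above is consistent with a plain linear min-max rescaling. The sketch below does not call toytree: it recomputes the projection from the rounded "X" values in the Out[4] table and compares the result with the printed array. The variable names are illustrative, and the tolerance allows for the six-decimal rounding of the table.

```python
import math

# "X" values for nodes 0-18, copied from the Out[4] table
x = [0.682352, 0.053821, 0.220360, 0.184372, 0.175906, 0.812095, 0.923345,
     0.276574, 0.819755, 0.889893, 0.512970, 0.244965, 0.824242, 0.213763,
     0.741467, 0.629940, 0.927407, 0.231908, 0.799125]

# values printed in Out[12] for min_value=2, max_value=15
printed = [11.35328489, 2.0, 4.47829579, 3.94275072, 3.81676794,
           13.2840094, 14.93954874, 5.31483464, 13.39800011, 14.44173875,
           8.83268852, 4.84444335, 13.46477251, 4.38012595, 12.23298906,
           10.57333721, 15.0, 4.65014844, 13.09100966]

min_value, max_value = 2, 15
lo, hi = min(x), max(x)
# linear rescaling of each value from [lo, hi] into [min_value, max_value]
mapped = [min_value + (v - lo) / (hi - lo) * (max_value - min_value) for v in x]

matches = all(math.isclose(m, p, abs_tol=1e-3) for m, p in zip(mapped, printed))
print(matches)
```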