Range-mapping¶
Range mapping is a convenient syntax for projecting raw data values, which may be very small or very large, into a range better suited to visualization, such as pixel sizes. It is especially useful for displaying a range of data values on a tree as a distribution of node sizes or edge widths. (See Color-Mapping for a similar implementation that projects discrete or continuous data values to a color map.)
import toytree
import numpy as np
Take Home
Range mapping allows you to easily project data values into an appropriate range of pixel sizes to enter as arguments to drawing functions. This is done by entering the data feature to be range-mapped as a tuple in the format: (feature, min_value, max_value, nan_value).
# example: map node idx values to node sizes in the range 12-25 px
tree = toytree.rtree.bdtree(10, seed=123)
tree.draw(ts='s', node_sizes=("idx", 12, 25));
Example Data¶
To demonstrate range-mapping we will use a 10-tip birth-death tree with three data features assigned to Nodes of the tree. The first feature, "X", contains small float values randomly drawn from the interval (0-1). The second feature, "Y", contains large float values randomly drawn from the interval (100-1000). The third feature, "Z", contains the same data as feature "X", but with missing values for many nodes.
tree = toytree.rtree.bdtree(10, seed=123)
rng = np.random.default_rng(seed=123)
tree.set_node_data("X", rng.uniform(0, 1, tree.nnodes), inplace=True);
tree.set_node_data("Y", rng.uniform(1e2, 1e3, tree.nnodes), inplace=True);
tree.set_node_data("Z", {i: i.X for i in tree[12:]}, inplace=True);
tree.get_node_data(["X", "Y", "Z"])
|  | X | Y | Z |
|---|---|---|---|
| 0 | 0.682352 | 566.348533 | NaN |
| 1 | 0.053821 | 308.400062 | NaN |
| 2 | 0.220360 | 249.313594 | NaN |
| 3 | 0.184372 | 548.010072 | NaN |
| 4 | 0.175906 | 624.452177 | NaN |
| 5 | 0.812095 | 265.904189 | NaN |
| 6 | 0.923345 | 113.405425 | NaN |
| 7 | 0.276574 | 524.019906 | NaN |
| 8 | 0.819755 | 755.418995 | NaN |
| 9 | 0.889893 | 926.740443 | NaN |
| 10 | 0.512970 | 662.980605 | NaN |
| 11 | 0.244965 | 925.410315 | NaN |
| 12 | 0.824242 | 878.221226 | 0.824242 |
| 13 | 0.213763 | 296.328586 | 0.213763 |
| 14 | 0.741467 | 879.514688 | 0.741467 |
| 15 | 0.629940 | 757.676743 | 0.629940 |
| 16 | 0.927407 | 350.078761 | 0.927407 |
| 17 | 0.231908 | 817.339198 | 0.231908 |
| 18 | 0.799125 | 878.699542 | 0.799125 |
Visualizing Raw Data¶
As the examples below show, the raw features "X", "Y" and "Z" do not serve well as arguments to the draw function for setting the size of node markers. The markers are either too small to see or so large that they obscure the entire plot. One solution would be to call get_node_data on the feature to extract the data and then multiply or divide the values by a constant to bring them into a more reasonable pixel size range. While this can be done, range-mapping provides a more convenient solution, explained below.
# raw "X" data values are too small to use for node_sizes
tree.draw(node_sizes="X");
# raw "Y" data values are too large to use for node_sizes
tree.draw(node_sizes="Y", height=350, width=350);
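The manual alternative can be sketched in plain Python, independent of toytree. The values are the first four "X" and "Y" entries from the example data table. The scaling constants (20 and 0.02) are arbitrary choices made for illustration only.

```python
# First four "X" and "Y" values copied from the example data table.
x_values = [0.682352, 0.053821, 0.220360, 0.184372]
y_values = [566.348533, 308.400062, 249.313594, 548.010072]

# Hand-picked constants (assumptions for illustration): each feature needs
# its own factor before its values resemble usable pixel sizes.
x_sizes = [value * 20 for value in x_values]
y_sizes = [value * 0.02 for value in y_values]

print([round(size, 1) for size in x_sizes])
print([round(size, 1) for size in y_sizes])
```

A constant must be chosen separately for every feature, and the scaled values have no guaranteed lower or upper bound, which is the inconvenience that range mapping removes.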
Using Range Mapping¶
Range mapping projects a set of values into a new range while preserving the relative differences among them. For example, the data in feature "X" range from about 0.05 to 0.93, all of which are too small to serve as pixel sizes. With range mapping we can project these values so that the smallest value maps to 5, the largest value maps to 15, and every intermediate value lands at the corresponding relative position between these bounds.
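The projection described here can be sketched as a linear min-max rescaling in plain Python. This is a minimal illustration under that assumption, not toytree's actual implementation; the function name range_map is hypothetical, and the handling of identical values (mapping them to the midpoint) is an arbitrary choice.

```python
def range_map(values, min_value, max_value):
    """Linearly rescale values so the smallest maps to min_value
    and the largest maps to max_value."""
    low, high = min(values), max(values)
    if high == low:
        # Assumption: with no spread to preserve, use the midpoint.
        midpoint = (min_value + max_value) / 2
        return [midpoint for _ in values]
    scale = (max_value - min_value) / (high - low)
    return [min_value + (value - low) * scale for value in values]


# A few "X" values from the example data, including its smallest
# (0.053821) and largest (0.927407) entries.
x_values = [0.682352, 0.053821, 0.220360, 0.927407]
sizes = range_map(x_values, 5, 15)
print([round(size, 2) for size in sizes])
```

Because the transformation is linear, ratios of differences between values are unchanged: a value halfway between the data minimum and maximum ends up halfway between min_value and max_value.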
Tuple syntax¶
Range mapping is specified with the tuple syntax (feature name, min_value, max_value, nan_value). The feature name is the only required element. If the others are omitted, the min, max, and nan values default to (5, 20, 0).
# project "X" values to pixel range using auto args for min,max,nan
tree.draw(node_sizes=("X",));
# project very small "X" values to pixel range 5-15
tree.draw(node_sizes=("X", 5, 15));
# project very large "Y" values to pixel range 5-15
tree.draw(node_sizes=("Y", 5, 15));
# project very small "Z" values to pixel range 5-15, with NaN values set to 0
tree.draw(node_sizes=("Z", 5, 15, 0));
Missing Data Values¶
When one or more Nodes do not contain a feature, they are assigned a value of np.nan by default. Range mapping can be told how to treat these values through the nan_value element of the tuple. The default is to convert them to 0, but you can set any value you want.
tree.draw(node_sizes=("Z", 5, 15, 10));
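The treatment of missing values can be sketched in plain Python as follows. This is an illustration only: it assumes the minimum and maximum are computed over the non-missing values and that NaN entries are replaced afterwards, which may differ from what toytree does internally. The function name range_map_with_nan is hypothetical.

```python
import math


def range_map_with_nan(values, min_value, max_value, nan_value):
    """Rescale non-missing values to [min_value, max_value] and
    replace NaN entries with nan_value."""
    present = [value for value in values if not math.isnan(value)]
    if not present:
        # Nothing to rescale: every entry is missing.
        return [nan_value for _ in values]
    low, high = min(present), max(present)
    # Assumption: if all present values are equal, map them to min_value.
    scale = (max_value - min_value) / (high - low) if high > low else 0.0
    return [
        nan_value if math.isnan(value) else min_value + (value - low) * scale
        for value in values
    ]


nan = float("nan")
# "Z" values for nodes 10-14 of the example data (nodes 10 and 11 are missing).
z_values = [nan, nan, 0.824242, 0.213763, 0.741467]
print(range_map_with_nan(z_values, 5, 15, 10))
```

Only a subset of the "Z" values is used here, so the resulting sizes differ from those drawn on the full tree. Choosing a nan_value inside the mapped range (such as 10) keeps nodes with missing data visible, whereas 0 hides them.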
Discrete Data Values¶
Note that range-mapping can only be applied to continuous data, not discrete/categorical data. A more appropriate way to treat discrete data may be to use Color-Mapping instead.
Get Range Mapped Data¶
When you use the tuple format to instruct toytree to perform range mapping on a data feature, it performs a simple operation under the hood to project the data into the new value range. This function is available as toytree.style.get_range_mapped_feature in case you wish to call it directly.
toytree.style.get_range_mapped_feature(tree, "X", min_value=2, max_value=15, nan_value=0)
array([11.35328489, 2. , 4.47829579, 3.94275072, 3.81676794, 13.2840094 , 14.93954874, 5.31483464, 13.39800011, 14.44173875, 8.83268852, 4.84444335, 13.46477251, 4.38012595, 12.23298906, 10.57333721, 15. , 4.65014844, 13.09100966])