A program for counting leaf-multi-labeled trees
The formulas used for these computations are described in
É. Czabarka,
P.L. Erdős,
V. Johnson,
V. Moulton:
Generating functions
for multi-labeled trees
Discrete Applied Mathematics 161 (2013) 107-117
and in
Virginia Johnson's Ph.D. thesis, (Aug. 2012)
Enumeration Results on Leaf-Labeled Trees.
Some background
In evolutionary biology, it is common practice to
use leaf-labeled (or phylogenetic) trees
to represent the evolution of
species, populations, organisms, and the like.
Technically speaking, such a tree is a simple, connected
graph with no cycles, and it is leaf-labeled in case
each of its leaves (i.e. vertices of degree 1)
is labeled by precisely one element from some set.
The set of labels corresponds to the set of
species, populations or organisms under consideration.
It has become apparent that
it can also be useful to employ a slightly more
general type of tree when trying to understand, for example,
gene evolution. In particular, due to
processes such as gene (or genome) duplication
or lateral gene transfer,
trees can often arise in which more than one leaf is
labeled by the same element of the label set.
We call such trees leaf-multi-labeled trees.
Note that in case each leaf if labeled only once,
a leaf-multi-labeled tree is just a leaf-labeled tree.
In addition to arising in the
study of gene versus species evolution,
leaf-multi-labeled trees have been used to construct phylogenetic networks,
and they naturally arise in biogeography.
The code and some caviats
Code for binary trees
and code for non-binary trees
are available by clicking on the appropriate links; and can be run using SageMath. This code is old:
when it was written, the scripting language for Sage was Python 2.
Currently the scripting language is Python 3.
(e.g. what used to be "print var" is now "print(var)")
There is a Python 2 to Python 3 converter online:
https://python2to3.com/
I do not know how well it
works.
Output description
The function T(n,k) (where n is the number of leaves and k is the number of labels)
returns six lists of counts, each containing the number of trees using at most k labels on 0,1,2,...,n leaves.
The first and second lists count rooted leaf-multi-labeled binary trees, the difference is
that the second list gives the count for those trees that use all k labels, while the first
gives the count that do not use more than k labels. The third list is for marked
leaf-multi-labeled binary trees, the fourth for marked leaf-multi-labeled binary trees
which use all k labels, the fifth is for unrooted leaf-multi-labeled binary trees, and
the sixth is for unrooted leaf-labeled binary trees which use all k labels.
For example for the call T(10,5) the program returns:
Number of leaves,number of labels
10 5
Rooted MUL Binary Trees
[0, 5, 15, 75, 495, 3600, 28275, 232500, 1979385, 17287050, 154041450]
Rooted MUL Binary Trees using all k labels
[0, 0, 0, 0, 0, 105, 2625, 42075, 554820, 6578550, 73169250]
Marked MUL Binary Trees
[0, 5, 25, 110, 600, 4200, 31730, 255750, 2144400, 18530875, 163771800]
Marked MUL Binary Trees using all k labels
[0, 0, 0, 0, 0, 120, 2925, 46065, 599670, 7041285, 77720100]
Unrooted MUL Binary Trees
[0, 5, 15, 35, 120, 600, 3530, 23250, 165510, 1243825, 9733950]
Unrooted MUL Binary Trees using all k labels
[0, 0, 0, 0, 0, 15, 300, 3990, 44850, 462735, 4550955]
This output tells you that for example there are 28275 rooted leaf-multi-labeled binary trees
on 6 leaves that use at most 5 labels and 2625 of these use all 5 labels, and there are
9733950 unrooted leaf-multi-labeled binary trees on 10 leaves that use at most 5 labels.
The function G(n,k) (where n is the number of leaves and k is the number of labels) returns two lists of counts, each containing the number of trees using at most k labels on 0,1,2,...,n leaves. The first list counts rooted non-binary leaf-multi-labeled trees, labeled with at most k labels. The second list is for unrooted leaf-multi-labeled non-binary trees, labeled with at most k labels.
For example for the call G(9, 5) the program returns:
Rooted Non-binary Multi-leaf-labeled Trees
[0, 5, 15, 110, 965, 9376, 97775, 1068450, 12081605, 140160650]
Unrooted Non-binary Multi-leaf-labeled Trees
[0, 5, 65, 335, 2840, 27151, 279465, 3028655, 34035550, 393044405]
This output tells you that for example there are 97,775 rooted leaf-multi-labeled non-binary trees on 6 leaves that use at most 5 labels and there are 34,035,550 unrooted leaf-multi-labeled non-binary trees on 8 leaves that use at most 5 labels.
Return to home page