A phylogenetic tree (PhT) with n leaves in this work is a binary tree whose leaves are distinctly labeled by the consecutive integers [n]= { 1, ... , n}, and the internal leaves are unlabeled. A binary tree with n leaves has n-2 internal vertices. A PhT-isomorphism is a tree isomorphism which keeps the labels of the leaves. A rooted binary tree is a binary tree, where one of the edges is split by a vertex which is labeled ROOT. There are (2n-5)!! different labeled binary trees and (2n-3)!! rooted labeled binary trees, on n leaves.
Each edge e divides the original tree into an unordered pair of induced subtrees uniquely. Both of these subtrees can have exactly one point of contraction, and that is at the appropriate endpoint of the edge e. We will denote the set of leaves of the one of the two induced subtrees which contains 1 by Se. We will call Se and [n] \ Se the clusters of the tree at e.
We will represent the edge e by the following bipartition of
[n]: (Se | [n] \ Se), and we call this
representation the split of
the tree at the edge e.
We use the notation ij | kl for the tree on four leaves i, j, k, l such that the leaves i, j are separated from the leaves j, k by an internal edge. We will also call ij | kl a quartet split or QS.
Given a binary tree T, ij || kl is a valid quartet split of T, if the reduced subtree generated on i, j, k, l is the tree ij | kl.
Given a finite set of quartet splits Q, we define the dyadic closure cl(Q) of Q as the set of quartet splits that can be inferred from Q by the repeated use of rules :
Let us be given an evolutionary tree T with the sequences of length k on the leaves 1, ... ,n over an a element alphabet. Let f(i,j)(alpha ,beta ) denote the probability that a given character in the sequence at leaf i is in state alpha and the same character in the sequence at leaf j is in state beta. Ordering the states, f(i,j)(alpha ,beta ) forms an (a x a) square matrix:
Any symmetric matrix, which has zeros in its diagonal and positive numbers
everywhere else, will be called a distance matrix.
An nxn distance matrix D(i,j) is called additive,
if there exists an n-leaf tree with positive edge weights on the
internal edges and non negative edge weights on the leaf edges, such that
D(ij) equals the sum of edge weights in the tree along the path
connecting i and j.
The matrix D(ij) satisfies the Four Point Condition if for all (not necessarily distinct) i, j, k, l the maximum of {D(ij)+D(kl), D(ik)+D(jl), D(il)+D(jk)} is not unique.
The concept of additive matrix and the Four Point condition is connected
by the following theorem of Buneman:
A matrix D is additive, if and only if it satisfies the Four
Point Condition. The edge weighted tree representing the additive distance
matrix is unique among the trees without vertices of degree two.