Zhang C, Stadler T, Klopfstein S, Heath TA, Ronquist F.
Originally, phylogenies were dated by assuming a constant molecular clock, the rate of which could be estimated by reference to the fossil record (Zuckerkandl and Pauling). Since then, divergence time estimation has become much more sophisticated.
Numerous studies have shown that the rate of molecular evolution varies significantly over time and among lineages, and it is now standard practice to accommodate such rate variation using relaxed-clock models (Drummond et al.). The calibration of the trees has also improved considerably. Instead of relying on a single-point estimate of the clock rate, it is now common to use multiple calibration points derived from the fossil record, each of which is associated with a probability distribution summarizing the available information (Yang and Rannala). Increasingly, such complex data sets are being analyzed with Bayesian methods, which provide a unifying framework for accommodating multiple sources of uncertainty.
Node dating, as this calibration approach is known, nevertheless has several shortcomings. First, the calibration data must be associated with fixed nodes in the tree, despite the fact that we do not know any of the nodes with absolute certainty. This may result in artifacts in the dating analysis, such as exaggerated confidence in the tree topology and the resulting age estimates.
To avoid constraining the tree, one can attach the calibration information to the most recent common ancestor of some named terminal taxa instead. If there is topological uncertainty, however, this results in the calibration information floating around in the tree in a manner that is unlikely to reflect the uncertainty in the placement of the calibration fossil. Second, node dating only extracts calibration information from the oldest fossil assigned to a particular group, as younger fossils from the same group do not provide any additional information on the minimum age of the calibrated node.
Moreover, many of the more poorly preserved fossils are excluded from the analysis from the outset because their placement cannot be inferred with sufficient certainty.
For node dating, one thus often ends up discarding most of the information preserved in the fossil record (but see Marshall). Third, the raw data from the fossil record (the ages of the fossils and their morphology) must be translated into appropriate probability distributions for the ages of the calibrated nodes, a process that is not straightforward (Parham et al.).
Even if the phylogenetic position of a fossil can be determined beyond any reasonable doubt, it is likely to sit on a side branch of some unknown length rather than directly on the calibration node itself. Thus, the fossil only provides a minimum age, and it remains unclear how the information available in the morphological characters about the period between the calibration point and the formation of the fossil can be translated into a probability distribution for the age of the calibrated node.
Thus, it is difficult to design these probability distributions properly, even though it has been shown that they often have a huge influence on the analysis, resulting in divergence time estimates that can vary by hundreds of millions of years (Warnock et al.).
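The sensitivity to the calibration density can be illustrated with a small numerical sketch. The offset exponential used here is only one commonly used form of calibration density, chosen for illustration; the fossil age and the mean values are hypothetical, and the function name is not taken from any dating software.

```python
import math

def offset_exponential_quantile(min_age, mean_excess, q):
    """Quantile q of a node-age prior that is zero below min_age and
    decays exponentially above it, with mean (min_age + mean_excess)."""
    return min_age - mean_excess * math.log(1.0 - q)

fossil_age = 100.0  # hypothetical minimum age from the oldest fossil (Myr)

# The same fossil yields very different upper bounds on the node age,
# depending on an essentially arbitrary choice of the mean excess.
for mean_excess in (5.0, 20.0, 80.0):
    upper = offset_exponential_quantile(fossil_age, mean_excess, 0.95)
    print(f"mean excess {mean_excess:5.1f} Myr -> 95% upper bound {upper:6.1f} Myr")
```

The fossil fixes only the lower bound; everything above it is determined by the analyst's choice of the density and its parameters.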
Another possibility is to use cross-validation techniques to identify and remove inconsistent calibration nodes (Near and Sanderson; Near et al.). Nevertheless, node dating still relies heavily on indirect ad hoc translation of the fossil record into appropriate calibration points.
A more satisfactory way of addressing fossil affinities is to treat the actual character evidence in a phylogenetic context. Several studies have analyzed fossil and recent taxa together, using combined morphological and molecular data, to study the placement of the fossils and their impact on the topology estimates for the recent taxa (Lee et al.).
However, these studies were not intended to result in calibrated trees, or if they were, they only used the inferred placements of the fossils to inform a classical node-dating approach.
Although the fossil placement and minimum calibration constraints on the tree were thus improved, these approaches could not avoid the largely arbitrary assignment of a probability distribution to the calibration points.
Total-evidence dating uses morphological data to infer fossil placement, like some previous studies, but it also calibrates the tree at the same time. Unlike node dating, it can easily be applied to rich sets of fossils without fixing any nodes in the tree.
It relies on the morphological similarity between a fossil and the reconstructed ancestors in the extant tree in assessing the likely length of any extinct side branch on which the fossil sits.
Advances in sequencing technologies coupled with new, statistically rigorous inference methods have greatly enhanced our ability to investigate phylogenetic relationships, but this is only the first step toward a dated phylogeny.
Molecular data only provide evolutionary distances in units of evolutionary change, such as substitutions per site. Branch lengths measured in this way are the product of the geological time duration and the evolutionary rate.
To estimate rates and times separately on a relative scale, it is necessary to introduce additional model assumptions that account for branch-rate variation across the tree and the distribution of speciation events over time. Early studies achieved this by considering the evolutionary rate to be constant over time, that is, assuming a global molecular clock (also called a strict clock; Zuckerkandl and Pauling). More recent methods allow the rate to vary over time under constraints specified by a relaxed-clock model, typically using a Bayesian inference framework (Hasegawa et al.).
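The reason extra assumptions are needed can be seen in a minimal sketch: a branch length in substitutions per site is a product of rate and time, so quite different rate and time combinations give the same observable length. The function name and all numerical values below are hypothetical and serve only as an illustration.

```python
import math

def branch_length(rate, duration):
    """Expected substitutions per site along a branch:
    rate (substitutions/site/Myr) times duration (Myr)."""
    return rate * duration

# Two hypothetical scenarios that the sequence data alone cannot tell apart.
slow_and_long = branch_length(rate=0.001, duration=50.0)
fast_and_short = branch_length(rate=0.005, duration=10.0)

print(slow_and_long, fast_and_short)
print("indistinguishable:", math.isclose(slow_and_long, fast_and_short))
```

A strict clock resolves the ambiguity by forcing a single rate on all branches, whereas a relaxed clock constrains how the rates may differ among branches.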
Regardless of whether a strict or a relaxed-clock model is used, the result is an estimate of species divergence times on a relative scale. To convert the relative times into absolute times, it has been customary to rely on user-specified internal nodes that are calibrated using additional information, typically from biogeographic events or from the fossil record.
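As a deliberately simplified sketch of this conversion, the snippet below rescales relative node ages using a single calibration node with a fixed, known age. This is an assumption made for illustration only: actual analyses place probability distributions on calibration nodes rather than point values, and the node names and ages here are hypothetical.

```python
import math

def calibrate(relative_ages, calibration_node, calibration_age):
    """Rescale relative node ages so that the calibration node
    receives its assumed absolute age (for example, in Myr)."""
    scale = calibration_age / relative_ages[calibration_node]
    return {node: age * scale for node, age in relative_ages.items()}

# Hypothetical relative ages from a clock analysis (root scaled to 1.0).
relative_ages = {"root": 1.0, "A": 0.6, "B": 0.25}

# Assume node A is known to be 30 Myr old.
absolute_ages = calibrate(relative_ages, "A", 30.0)
for node, age in absolute_ages.items():
    print(f"{node}: {age:.1f} Myr")
```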
Using one or more such calibration nodes, it is possible to estimate the ages of all other nodes in the tree. Compared with node dating methods, the total-evidence approach introduces a number of innovations in terms of how the fossil information is incorporated in the analysis.
Specifically, homologous morphological characters are coded for fossil and extant taxa and included in a combined matrix. Age estimates are assigned to individual fossils based on the dating of the strata in which they are found. These data, together with molecular sequences sampled from extant taxa, are analyzed in an integrative framework to directly inform the inference of divergence times, while accounting for uncertainty in the placement of the fossils in the phylogeny.
One of the essential strengths of the total-evidence dating approach is that it allows the probabilistic model to be expanded to include additional sources of information that could be important in dating but have not been modeled explicitly before. Here, we exploit this to address the fossilization process and the sampling procedure, both of which potentially have a major impact on divergence time estimates.
This has to be done in the context of a tree model capable of accommodating speciation, extinction, fossilization, and sampling. The early implementations of total-evidence dating Lee et al.
The standard birth-death model used in phylogenetics, that is, the constant-rate reconstructed birth-death process, assumes that birth and death rates (or speciation and extinction rates) are constant over time and that no individuals (fossils) are sampled in the past (Kendall; Nee et al.).
Stadler extended this process to account for serially sampled lineages (also see Didier et al.). The fossilized birth-death (FBD) process prior simultaneously models the speciation and extinction patterns and observations of fossils in a birth-death macroevolutionary framework. Their approach is more similar to node dating than total-evidence dating in that fossils are associated with specific clades a priori, and the lengths of the fossil branches are only informed by the FBD prior, not by character data.
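To give a feel for the process that the FBD prior describes, the following is a minimal forward-simulation sketch. It assumes constant per-lineage rates of speciation, extinction, and fossil recovery, plus uniformly random sampling of extant lineages with a fixed probability. It tracks only the number of lineages and the ages of recovered fossils, not the tree itself, and it does not implement the FBD probability density. The function name, parameter names, and values are hypothetical.

```python
import random

def simulate_fbd_counts(birth, death, fossil, rho, t_origin, rng):
    """Simulate lineage counts forward in time from a single lineage at
    t_origin (time before present) down to the present (time 0).

    birth, death, fossil: per-lineage rates of speciation, extinction,
    and fossil recovery; rho: sampling probability of each extant lineage.
    Returns (fossil_ages, n_extant_sampled).
    """
    n = 1
    t = t_origin
    fossil_ages = []
    per_lineage_total = birth + death + fossil
    while n > 0:
        # Waiting time to the next event among the n living lineages.
        t -= rng.expovariate(n * per_lineage_total)
        if t <= 0.0:
            break  # reached the present
        u = rng.random() * per_lineage_total
        if u < birth:
            n += 1                 # speciation
        elif u < birth + death:
            n -= 1                 # extinction
        else:
            fossil_ages.append(t)  # a fossil is recovered at age t
    # Each surviving lineage is sampled independently with probability rho.
    n_sampled = sum(rng.random() < rho for _ in range(n))
    return fossil_ages, n_sampled

rng = random.Random(1)
fossil_ages, n_sampled = simulate_fbd_counts(
    birth=0.10, death=0.05, fossil=0.02, rho=0.5, t_origin=20.0, rng=rng
)
print(len(fossil_ages), "fossils;", n_sampled, "extant lineages sampled")
```

Because fossils arise from the same process that generates the extant lineages, their number and ages carry information about the speciation, extinction, and fossil recovery rates.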
Recently, Gavryushkina et al. So far, work on the FBD process has assumed either complete or uniformly random sampling of fossils and extant taxa.
Although this is convenient from a mathematical perspective, it potentially ignores an important bias in the data resulting from the common non-random choice of terminals for dating analyses.