Thursday, December 13, 2012

phylodiv: an R function for calculating the phylogenetic diversity of ecological samples

Here's a function I have written for the R statistical environment that calculates the Phylogenetic Diversity (PD) of multiple samples. I am providing it for free and without warranty under the GNU General Public License. You need to be familiar with R to use this function, and the ape package must be installed. To load the function, place the file in your working directory and type source("phylodiv.R"). Use plain straight quotes, as R does not accept curly quotes.

Latest version: 13th December 2012. A couple of small changes to improve efficiency.

phylodiv (x, phy)
x is a community data table (as in the vegan package) with species/OTUs as columns and samples/sites as rows. Columns are labelled with the names of the species/OTUs. Rows are labelled with the names of the samples/sites. Data can be either abundance or incidence (0/1).
phy is a rooted phylogenetic tree with branch lengths, stored as a phylo object (as in the ape package), whose terminal nodes are labelled with names matching the column names of the community data table. Note that the function trims away any terminal taxa not present in the community data table, so it is not necessary to do this beforehand.
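As a rough illustration of the two inputs described above, the sketch below builds a small community table and a matching tree and runs a few sanity checks. The site names, species names, and Newick string are invented for the example, and the tree-related checks only run if the ape package is installed.

```r
# Hypothetical community table: samples/sites as rows, species/OTUs as columns.
comm <- matrix(
  c(1, 1, 0, 0,
    0, 0, 3, 2,
    5, 0, 0, 1),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("site1", "site2", "site3"),
                  c("sp_A", "sp_B", "sp_C", "sp_D"))
)

# Hypothetical rooted tree with branch lengths. sp_E is in the tree but not
# in the table; according to the description above, phylodiv trims such tips.
newick <- "(((sp_A:1,sp_B:1):1,sp_C:2):1,(sp_D:2,sp_E:2):1);"

if (requireNamespace("ape", quietly = TRUE)) {
  phy <- ape::read.tree(text = newick)
  # The tree must be rooted and carry branch lengths.
  stopifnot(ape::is.rooted(phy), !is.null(phy$edge.length))
  # Every species column should match a tip label.
  stopifnot(all(colnames(comm) %in% phy$tip.label))
  # Tips that are absent from the community table.
  print(setdiff(phy$tip.label, colnames(comm)))
}
```

Checking the names up front is optional, but a mismatch between column names and tip labels is an easy mistake to make.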
phylodiv takes a community data table and a phylogenetic tree (rooted and with branch lengths) and calculates the Phylogenetic Diversity (PD) of all samples/sites. PD is defined as the total length of all branches spanning a set of terminal taxa representing an ecological sample (Faith, 1992). Please note that if the most recent common ancestor (node) of the taxa in a sample is not the root of the tree, the branches connecting this node to the root are also included in the calculation. Calculations use the efficient matrix algebra solution of Rodrigues & Gaston (2002).
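To make the matrix idea concrete, here is a minimal base-R sketch of one way such a calculation can be written. It is not the code inside phylodiv.R and may differ in detail from the formulation of Rodrigues & Gaston (2002). The toy tree ((A:1,B:1):1,(C:1,D:1):1), the branch names, and the community table are all made up for illustration.

```r
tips <- c("A", "B", "C", "D")

# Length of every branch in the toy tree (four terminal, two internal).
branch_len <- c(A = 1, B = 1, AB = 1, C = 1, D = 1, CD = 1)

# Branch-by-tip matrix: 1 if the tip descends from (sits below) the branch.
desc <- rbind(
  A  = c(1, 0, 0, 0),
  B  = c(0, 1, 0, 0),
  AB = c(1, 1, 0, 0),
  C  = c(0, 0, 1, 0),
  D  = c(0, 0, 0, 1),
  CD = c(0, 0, 1, 1)
)
colnames(desc) <- tips

# Hypothetical community table with abundances.
comm <- rbind(
  site1 = c(2, 1, 0, 0),
  site2 = c(1, 0, 1, 0),
  site3 = c(1, 1, 1, 1),
  site4 = c(0, 0, 5, 0)
)
colnames(comm) <- tips

# Abundances are reduced to presence/absence.
incid <- (comm > 0) * 1

# A branch counts for a site if at least one of its descendant tips is present.
branch_hit <- ((incid %*% t(desc)) > 0) * 1

# PD per site is the summed length of the branches that count.
pd <- drop(branch_hit %*% branch_len)
print(pd)
```

Because every branch between a present tip and the root has that tip as a descendant, the path to the root is counted automatically. For example, site1 contains only A and B but still picks up the internal branch AB.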
phylodiv returns a vector giving the Phylogenetic Diversity (PD) of each sample/site in x.
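A call might look like the sketch below. The data are invented, and the example assumes that phylodiv.R sits in the working directory and that ape is installed; if either is missing, the call is skipped.

```r
# Hypothetical data; sample and species names are invented.
comm <- rbind(
  plot1 = c(sp_A = 4, sp_B = 0, sp_C = 1),
  plot2 = c(sp_A = 0, sp_B = 2, sp_C = 2)
)
newick <- "((sp_A:1,sp_B:1):1,sp_C:2);"

pd <- NULL
if (requireNamespace("ape", quietly = TRUE) && file.exists("phylodiv.R")) {
  source("phylodiv.R")                    # defines phylodiv()
  phy <- ape::read.tree(text = newick)    # rooted tree with branch lengths
  pd <- phylodiv(comm, phy)               # one PD value per row of comm
  # Pair each value with its sample name for easier reading.
  print(data.frame(sample = rownames(comm), PD = as.numeric(pd)))
}
```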
Faith DP. 1992. Conservation evaluation and phylogenetic diversity. Biological Conservation 61: 1-10.
Rodrigues A & Gaston KJ. 2002. Maximising phylogenetic diversity in the selection of networks of conservation areas. Biological Conservation 105: 103-111.