tag:blogger.com,1999:blog-84840270449987624612024-03-14T16:58:09.154+11:00David Nipperess - Research and TeachingDavid Nipperesshttp://www.blogger.com/profile/07546966746014882136noreply@blogger.comBlogger11125tag:blogger.com,1999:blog-8484027044998762461.post-41947964265030042362016-01-22T16:06:00.000+11:002016-01-22T16:06:08.092+11:00PDcalc: an implementation of the Phylogenetic Diversity (PD) calculus in R<span style="font-family: inherit;">I have started putting my various PD functions together in an R package (<i>PDcalc</i>). You can find a development version of the package here:</span><br />
<span style="font-family: inherit;"><br /></span>
<a href="https://github.com/davidnipperess/PDcalc">https://github.com/davidnipperess/PDcalc</a><br />
<br />
I will keep all current versions of the functions available for download from this site for as long as I can but they will not be updated. All future development will be for the R package.<br />
<br />
Thanks to everyone who has been using (and providing feedback on) my functions. I hope you find <i>PDcalc</i> at least as useful.<br />
<br />
DavidDavid Nipperesshttp://www.blogger.com/profile/07546966746014882136noreply@blogger.com0tag:blogger.com,1999:blog-8484027044998762461.post-32460863638903436322016-01-08T11:30:00.000+11:002016-01-08T11:37:50.188+11:00phylocurve: an R function for generating a rarefaction curve of Phylogenetic Diversity<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="http://2.bp.blogspot.com/-Xf4capBxGdc/T_T3VMLDBnI/AAAAAAAAABo/ILQDpY-nUCk/s1600/Rlogo-small.png" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" src="http://2.bp.blogspot.com/-Xf4capBxGdc/T_T3VMLDBnI/AAAAAAAAABo/ILQDpY-nUCk/s1600/Rlogo-small.png" /></a></div>
<div style="font: 12.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
Here's a function I have written for the <a href="http://www.r-project.org/">R statistical environment</a> that calculates a rarefaction curve giving expected phylogenetic diversity (mean and variance) for multiple values of sampling effort. Sampling effort can be defined in terms of the number of individuals, sites or species. Expected phylogenetic diversity is calculated using an exact analytical formulation (Nipperess & Matsen 2013) that is both more accurate and more computationally efficient than randomisation methods. I am providing it for free and without warranty under the <a href="http://www.gnu.org/licenses/">GNU General Public License</a>. You need to be familiar with R to use this function. The function also requires that the <a href="http://cran.r-project.org/web/packages/ape/index.html">ape package</a> be installed. To load the function, place the file in your working folder and type <span style="font: normal normal normal 11px/normal Courier;">source("phylocurve.R")</span>.</div>
<div style="font-family: Helvetica; font-size: 12px; font-variant: normal; line-height: normal; margin: 0px; min-height: 14px;">
<div style="font-style: normal; font-weight: normal;">
<br /></div>
<i style="font-weight: normal;">UPDATE (8th Jan 2016):</i> I have finally implemented the <i><b>exact solution for the variance</b></i> of PD under rarefaction! A special acknowledgement to <a href="https://sites.google.com/site/mazelflorent/" style="font-weight: normal;" target="_blank">Florent Mazel</a> for sharing his code with me. The function could probably be more efficient but it does the job and is still substantially faster (and more precise) than Monte Carlo subsampling.<br />
<div style="font-style: normal; font-weight: normal;">
<br /></div>
</div>
<div style="font: 12.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
<i>UPDATE (21st Jan 2014)</i>: When using sampling without replacement (classic rarefaction), older versions of this function would not work with datasets that have a large number of objects (individuals/sites/species) to be rarefied. This is because phylocurve must calculate the number of possible combinations of each subset of objects in the total and this can be a very large number. It is possible, therefore, to exceed the largest number that R can handle. If that occurred, no warning would be issued but the output would include NAs instead of expected values of Phylogenetic Diversity. <i><b>I have fixed this problem such that phylocurve should now be able to handle very large numbers</b></i> (I had to re-learn some high school maths but I got there eventually). It might still be possible to exceed the limits of R but you should have a lot more headroom. If you do still encounter this problem, your options at this point are to: 1) sample with replacement (replace=TRUE); 2) try using a more powerful computer (although this is unlikely to make much difference); or 3) use the much slower phylocurve.perm function instead.</div>
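<div style="font: 12.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
The overflow described above is easy to reproduce in base R. The sketch below is only an illustration with made-up numbers, and it is an assumption that working on the log scale resembles the fix used in the function: a ratio of two very large combination counts becomes NaN when computed directly, but stays finite when the log-combinations are subtracted first.</div>

```r
# Hypothetical numbers: N objects in total, n of them on one branch,
# m objects drawn without replacement.
N <- 2000
n <- 100
m <- 1000

# Naive ratio of combinations: both counts overflow to Inf, so the ratio is NaN.
naive <- choose(N - n, m) / choose(N, m)

# Same ratio on the log scale: subtract log-combinations, then exponentiate.
stable <- exp(lchoose(N - n, m) - lchoose(N, m))

print(c(naive = naive, stable = stable))
```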
<div style="font: 12.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px; min-height: 14.0px;">
<br /></div>
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
<b>Usage</b></div>
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
phylocurve(x, phy, stepm = 1, subsampling = "individual", replace = FALSE)</div>
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
<b>Arguments</b></div>
<div style="font-family: Helvetica; font-size: 11px; font-variant: normal; font-weight: normal; line-height: normal; margin: 0px;">
x is a community data table (as in the vegan package) with species/OTUs as columns and samples/sites as rows. Columns are labelled with the names of the species/OTUs. Rows are labelled with the names of the samples/sites. Data can be either abundance or incidence (0/1). <i>Column labels must match tip labels in the phylogenetic tree exactly!</i></div>
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
phy is a rooted phylogenetic tree with branch lengths stored as a phylo object (as in the ape package) with terminal nodes labelled with names matching those of the community data table. Note that the function trims away any terminal taxa not present in the community data table, so it is not necessary to do this beforehand.</div>
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
<span style="font: normal normal normal 11px/normal Courier;">stepm</span> is the size of the interval in a sequence of numbers of individuals, sites or species to which x is to be rarefied.</div>
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
<span style="font: normal normal normal 11px/normal Courier;">subsampling</span> indicates whether the subsampling will be by “individual” (default), “site” or “species”. When there are multiple sites, rarefaction by individuals or species is done by first pooling the sites.</div>
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
<span style="font: normal normal normal 11px/normal Courier;">replace</span> is a boolean indicating whether subsampling should be done with or without replacement.</div>
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px; min-height: 13.0px;">
<br /></div>
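<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
Because column labels must match tip labels exactly, it can help to check them before calling the function. This is a minimal sketch with made-up data: <span style="font: normal normal normal 11px/normal Courier;">comm</span> and <span style="font: normal normal normal 11px/normal Courier;">tip_labels</span> are hypothetical names, and <span style="font: normal normal normal 11px/normal Courier;">tip_labels</span> stands in for the <span style="font: normal normal normal 11px/normal Courier;">tip.label</span> element of an ape phylo object.</div>

```r
# Hypothetical community table: 2 sites (rows) by 3 species (columns).
comm <- matrix(c(2, 0, 1, 1, 0, 3), nrow = 2,
               dimnames = list(c("site1", "site2"),
                               c("sp_A", "sp_B", "sp C")))

# Stand-in for phy$tip.label; note "sp_C" here versus "sp C" in the table.
tip_labels <- c("sp_A", "sp_B", "sp_C", "sp_D")

# Columns that have no matching tip in the tree.
missing_from_tree <- setdiff(colnames(comm), tip_labels)

if (length(missing_from_tree) > 0) {
  message("Columns with no matching tip: ",
          paste(missing_from_tree, collapse = ", "))
}
```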
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
<b>Details</b></div>
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
<span style="font: normal normal normal 11px/normal Courier;">phylocurve</span> takes a community data table and a rooted phylogenetic tree (with branch lengths) and calculates expected mean and variance of Phylogenetic Diversity (PD) for every specified value of <span style="font: normal normal normal 11px/normal Courier;">m</span> individuals, sites or species. <span style="font: normal normal normal 11px/normal Courier;">m</span> will range from 1 to the total number of individuals/sites/species in increments given by <span style="font: normal normal normal 11px/normal Courier;">stepm</span>. Calculations are done using the exact analytical formulae (Nipperess & Matsen, 2013) generalised from the classic equation of Hurlbert (1971). When there are multiple sites in the community table and rarefaction is by individuals or species, sites are first pooled.</div>
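<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
To make the idea concrete, here is a small base R sketch of the expected mean PD when rarefying by individuals without replacement. It is not the code of phylocurve itself: the toy tree, the abundances and the names <span style="font: normal normal normal 11px/normal Courier;">branch_len</span>, <span style="font: normal normal normal 11px/normal Courier;">n_b</span> and <span style="font: normal normal normal 11px/normal Courier;">expected_pd</span> are all assumptions made for illustration. Each branch contributes its length multiplied by the probability that at least one of the individuals descending from it is included in a subsample of size m. The variance is not covered here.</div>

```r
# Toy rooted tree ((A:1,B:1):1,C:2); one entry per branch.
branch_len <- c(A = 1, B = 1, AB = 1, C = 2)

# Hypothetical pooled abundances: 2 x A, 1 x B, 3 x C.
abund <- c(A = 2, B = 1, C = 3)
N <- sum(abund)

# Number of individuals descending from each branch.
n_b <- c(A = 2, B = 1, AB = 3, C = 3)

# Expected PD for a subsample of m individuals drawn without replacement.
expected_pd <- function(m) {
  # Probability that a branch is missed entirely, computed on the log scale.
  p_missed <- exp(lchoose(N - n_b, m) - lchoose(N, m))
  sum(branch_len * (1 - p_missed))
}

# Rarefaction curve from 1 to N in steps of stepm.
stepm <- 1
m_values <- seq(1, N, by = stepm)
pd_curve <- cbind(m = m_values,
                  expected_PD = vapply(m_values, expected_pd, numeric(1)))
print(pd_curve)
```

<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
With every individual included (m = N), the expected PD equals the total branch length of the tree, which is a useful sanity check.</div>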
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
<b>Value</b></div>
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
<span style="font: normal normal normal 11px/normal Courier;">phylocurve</span> returns a matrix object of three columns giving the expected PD values (mean and variance) for each value of <span style="font: normal normal normal 11px/normal Courier;">m</span>.</div>
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
<b>References</b></div>
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
Hurlbert (1971) The nonconcept of Species Diversity: a critique and alternative parameters. <i>Ecology</i> 52: 577-586.<br />
Nipperess & Matsen (2013) The mean and variance of phylogenetic diversity under rarefaction. <i>Methods in Ecology & Evolution 4: </i>566-572<i>. </i>Manuscript also available from <a href="http://arxiv.org/abs/1208.6552" target="_blank">ArXiv.org</a></div>
<div style="font: 12.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
<a href="https://dl.dropboxusercontent.com/u/89646921/R%20Functions/phylocurve/phylocurve.R?dl=1" target="_blank">>> Download phylocurve.R</a></div>
David Nipperesshttp://www.blogger.com/profile/07546966746014882136noreply@blogger.com0tag:blogger.com,1999:blog-8484027044998762461.post-87896076572411133612015-03-16T14:22:00.000+11:002015-03-16T14:57:42.102+11:00Data management and manipulation in RHere's the materials I used for teaching a module on R for the <a href="http://www.gg.mq.edu.au/" target="_blank">Genes to Geoscience</a> Research Enrichment Program.<br />
<h2>
Files for my R module</h2>
<h3>
<a href="https://dl.dropboxusercontent.com/u/89646921/R%20module%202015/Data%20manipulation%20complete.zip" target="_blank"> Zip archive</a></h3>
<div>
</div>
David Nipperesshttp://www.blogger.com/profile/07546966746014882136noreply@blogger.com0tag:blogger.com,1999:blog-8484027044998762461.post-52435216125894892552014-01-21T12:30:00.001+11:002016-01-08T11:37:16.988+11:00phylorare: an R function for calculating the rarefied Phylogenetic Diversity of ecological samples<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="http://2.bp.blogspot.com/-Xf4capBxGdc/T_T3VMLDBnI/AAAAAAAAABo/ILQDpY-nUCk/s1600/Rlogo-small.png" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" src="http://2.bp.blogspot.com/-Xf4capBxGdc/T_T3VMLDBnI/AAAAAAAAABo/ILQDpY-nUCk/s1600/Rlogo-small.png" /></a></div>
<div style="font: 12.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
Here's a function I have written for the <a href="http://www.r-project.org/">R statistical environment</a> that calculates expected phylogenetic diversity (under rarefaction) of multiple samples. The function uses an exact analytical formula (Nipperess & Matsen 2013). I am providing it for free and without warranty under the <a href="http://www.gnu.org/licenses/">GNU General Public License</a>. You need to be familiar with R to use this function. The function also requires that the <a href="http://cran.r-project.org/web/packages/ape/index.html">ape package</a> be installed. To load the function, place the file in your working folder and type <span style="font: normal normal normal 11px/normal Courier;">source("phylorare.R")</span>.</div>
<div style="font: 12.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px; min-height: 14.0px;">
<br />
<i>UPDATE (21st Jan 2014)</i>: When using sampling without replacement (classic rarefaction), older versions of this function would not work with datasets that have a large number of objects (individuals/sites/species) to be rarefied. This is because phylorare must calculate the number of possible combinations of a subset of objects in the total and this can be a very large number. It is possible, therefore, to exceed the largest number that R can handle. If that occurred, no warning would be issued but the output would include NAs instead of expected values of Phylogenetic Diversity. <i><b>I have fixed this problem such that phylorare should now be able to handle very large numbers</b></i> (I had to re-learn some high school maths but I got there eventually). It might still be possible to exceed the limits of R but you should have a lot more headroom. If you do still encounter this problem, your options at this point are to: 1) sample with replacement (replace=TRUE); 2) try using a more powerful computer (although this is unlikely to make much difference); or 3) use a much slower algorithmic solution (such as phylocurve.perm).<br />
<br /></div>
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
<b>Usage</b></div>
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
phylorare(x, phy, m = 1, subsampling = "individual", replace = FALSE)</div>
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
<b>Arguments</b></div>
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
x is a community data table (as in the vegan package) with species/OTUs as columns and samples/sites as rows. Columns are labelled with the names of the species/OTUs. Rows are labelled with the names of the samples/sites. Data can be either abundance or incidence (0/1).</div>
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
phy is a rooted phylogenetic tree with branch lengths stored as a phylo object (as in the ape package) with terminal nodes labelled with names matching those of the community data table. Note that the function trims away any terminal taxa not present in the community data table, so it is not necessary to do this beforehand.</div>
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
<span style="font: normal normal normal 11px/normal Courier;">m</span> is the number of individuals, sites or species to which <span style="font: normal normal normal 11px/normal Courier;">x</span> is to be rarefied.</div>
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
<span style="font: normal normal normal 11px/normal Courier;">subsampling</span> indicates whether the subsampling will be by “individual” (default), “site” or “species”. When there are multiple sites, rarefaction by individuals or species is done separately for each site.</div>
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
<span style="font: normal normal normal 11px/normal Courier;">replace</span> is a boolean indicating whether subsampling should be done with or without replacement.</div>
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px; min-height: 13.0px;">
<br /></div>
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
<b>Details</b></div>
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
<span style="font: normal normal normal 11px/normal Courier;">phylorare</span> takes a community data table and a rooted phylogenetic tree (with branch lengths) and calculates expected Phylogenetic Diversity (PD) for a given sample size of <span style="font: normal normal normal 11px/normal Courier;">m</span> individuals, sites or species. Calculations are done using an exact analytical formula generalised from the classic equation of Hurlbert (1971). When there are multiple sites in the community table and rarefaction is either individual or species-based, rarefied PD values are calculated for each site separately. In this case, all PD values will include the root of <span style="font: normal normal normal 11px/normal Courier;">phy</span> even if all the species in a site share a more recent common ancestor.</div>
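<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
The per-site behaviour, including the treatment of the root, can be sketched in base R as follows. This is an illustration under assumptions rather than the code of phylorare: the toy tree, the community table and the names <span style="font: normal normal normal 11px/normal Courier;">branch_tips</span> and <span style="font: normal normal normal 11px/normal Courier;">rarefied_pd</span> are made up. In site1, species A and B share a common ancestor more recent than the root, yet the branch connecting that ancestor to the root still counts towards PD.</div>

```r
# Toy rooted tree ((A:1,B:1):1,C:2); one entry per branch.
branch_len <- c(A = 1, B = 1, AB = 1, C = 2)

# Which tips (columns) descend from which branch (rows).
branch_tips <- rbind(A  = c(1, 0, 0),
                     B  = c(0, 1, 0),
                     AB = c(1, 1, 0),
                     C  = c(0, 0, 1))
colnames(branch_tips) <- c("A", "B", "C")

# Hypothetical abundance table: sites as rows, species as columns.
comm <- matrix(c(2, 1, 0,
                 0, 1, 3), nrow = 2, byrow = TRUE,
               dimnames = list(c("site1", "site2"), c("A", "B", "C")))

m <- 2  # number of individuals to rarefy each site to

# Expected PD of each site, rarefied separately, without replacement.
rarefied_pd <- vapply(rownames(comm), function(site) {
  N   <- sum(comm[site, ])                        # individuals in this site
  n_b <- as.vector(branch_tips %*% comm[site, ])  # individuals per branch
  p_missed <- exp(lchoose(N - n_b, m) - lchoose(N, m))
  sum(branch_len * (1 - p_missed))
}, numeric(1))

print(rarefied_pd)
```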
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
<b>Value</b></div>
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
phylorare returns a vector object giving the expected PD values of all samples/sites in x.</div>
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
<b>References</b></div>
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
Hurlbert. 1971. The nonconcept of Species Diversity: a critique and alternative parameters. <i>Ecology</i> 52: 577-586.<br />
Nipperess & Matsen. 2013. The mean and variance of phylogenetic diversity under rarefaction. <i>Methods in Ecology & Evolution 4: </i>566-572<i>. </i>Manuscript also available from <a href="http://arxiv.org/abs/1208.6552" target="_blank">ArXiv.org</a></div>
<div style="font: 12.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
<a href="https://dl.dropboxusercontent.com/u/89646921/R%20Functions/phylorare.R?dl=1">>> Download phylorare.R</a></div>
David Nipperesshttp://www.blogger.com/profile/07546966746014882136noreply@blogger.com0tag:blogger.com,1999:blog-8484027044998762461.post-71974290713329633042012-12-13T14:16:00.000+11:002014-01-21T12:40:13.202+11:00phylodiv: an R function for calculating the phylogenetic diversity of ecological samples<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="http://2.bp.blogspot.com/-Xf4capBxGdc/T_T3VMLDBnI/AAAAAAAAABo/ILQDpY-nUCk/s1600/Rlogo-small.png" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" src="http://2.bp.blogspot.com/-Xf4capBxGdc/T_T3VMLDBnI/AAAAAAAAABo/ILQDpY-nUCk/s1600/Rlogo-small.png" /></a></div>
<div style="font: 12.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
Here's a function I have written for the <a href="http://www.r-project.org/">R statistical environment</a> that calculates the Phylogenetic Diversity (PD) of multiple samples. I am providing it for free and without warranty under the <a href="http://www.gnu.org/licenses/">GNU General Public License</a>. You need to be familiar with R to use this function. The function also requires that the <a href="http://cran.r-project.org/web/packages/ape/index.html">ape package</a> be installed. To load the function, place the file in your working folder and type <span style="font: normal normal normal 11px/normal Courier;">source("phylodiv.R")</span>.</div>
<div style="font: 12.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px; min-height: 14.0px;">
<br /></div>
<div style="font: 12.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
<i>Latest version:</i> 13th December 2012. A couple of small changes to improve efficiency.</div>
<div style="font: 12.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px; min-height: 14.0px;">
<br /></div>
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
<b>Usage</b></div>
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
phylodiv (x, phy)</div>
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
<b>Arguments</b></div>
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
x is a community data table (as in the vegan package) with species/OTUs as columns and samples/sites as rows. Columns are labelled with the names of the species/OTUs. Rows are labelled with the names of the samples/sites. Data can be either abundance or incidence (0/1).</div>
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
phy is a rooted phylogenetic tree with branch lengths stored as a phylo object (as in the ape package) with terminal nodes labelled with names matching those of the community data table. Note that the function trims away any terminal taxa not present in the community data table, so it is not necessary to do this beforehand.</div>
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
<b>Details</b></div>
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
phylodiv takes a community data table and a phylogenetic tree (rooted and with branch lengths) and calculates the Phylogenetic Diversity (PD) of all samples/sites. PD is defined as the total length of all branches spanning a set of terminal taxa representing an ecological sample (Faith, 1992). Please note that, if the common ancestor (node) of the set of taxa of a sample is not the root of the tree, then the set of branches connecting this node to the root are also included in the calculation. Calculations are achieved using the efficient matrix algebra solution of Rodrigues & Gaston (2002).</div>
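<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
A matrix-based PD calculation of this general kind can be sketched in a few lines of base R. The example is an assumption-laden illustration, not the code of phylodiv: the toy tree, the community table and the name <span style="font: normal normal normal 11px/normal Courier;">branch_tips</span> are made up. A sites-by-branches presence matrix is obtained by multiplying the occurrence table by a tips-by-branches matrix, and PD is the sum of the lengths of the branches present. Note that site1 contains only species A but still picks up the internal branch leading back to the root.</div>

```r
# Toy rooted tree ((A:1,B:1):1,C:2); one entry per branch.
branch_len <- c(A = 1, B = 1, AB = 1, C = 2)

# Which tips (columns) descend from which branch (rows).
branch_tips <- rbind(A  = c(1, 0, 0),
                     B  = c(0, 1, 0),
                     AB = c(1, 1, 0),
                     C  = c(0, 0, 1))
colnames(branch_tips) <- c("A", "B", "C")

# Hypothetical abundance table: sites as rows, species as columns.
comm <- matrix(c(2, 0, 0,
                 0, 1, 3), nrow = 2, byrow = TRUE,
               dimnames = list(c("site1", "site2"), c("A", "B", "C")))

# Convert abundance to incidence (0/1).
occ <- (comm > 0) * 1

# Sites-by-branches matrix: 1 if any species in the site descends from the branch.
branch_present <- ((occ %*% t(branch_tips)) > 0) * 1

# PD of each site is the summed length of the branches present.
pd <- drop(branch_present %*% branch_len)
print(pd)
```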
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
<b>Value</b></div>
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
phylodiv returns a vector giving the Phylogenetic Diversity (PD) of each sample/site in x.</div>
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
<b>References</b></div>
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
Faith DP. 1992. Conservation evaluation and phylogenetic diversity. <i>Biological Conservation</i> 61: 1-10.</div>
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
Rodrigues A & Gaston KJ. 2002. Maximising phylogenetic diversity in the selection of networks of conservation areas. <i>Biological Conservation</i> 105: 103-111.</div>
<div style="font: 12.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
<a href="https://dl.dropbox.com/u/89646921/R%20Functions/Phylodiv/phylodiv.R?dl=1">>> Download phylodiv.R</a></div>
David Nipperesshttp://www.blogger.com/profile/07546966746014882136noreply@blogger.com0tag:blogger.com,1999:blog-8484027044998762461.post-68580500297571679942012-09-04T11:07:00.003+10:002012-09-04T11:07:55.106+10:00The mean and variance of phylogenetic diversity under rarefactionErick Matsen and I have just submitted a manuscript to <i><a href="http://www.methodsinecologyandevolution.org/" target="_blank">Methods in Ecology and Evolution</a></i> on the rarefaction of phylogenetic diversity. The manuscript is available to view on <a href="http://arxiv.org/abs/1208.6552" target="_blank">ArXiv.org</a>.David Nipperesshttp://www.blogger.com/profile/07546966746014882136noreply@blogger.com0tag:blogger.com,1999:blog-8484027044998762461.post-47124458896237551612012-07-05T14:41:00.000+10:002012-07-05T14:45:45.618+10:00phylocurve.perm: an R function for generating a rarefaction curve of Phylogenetic Diversity by randomisation<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="http://2.bp.blogspot.com/-Xf4capBxGdc/T_T3VMLDBnI/AAAAAAAAABo/ILQDpY-nUCk/s1600/Rlogo-small.png" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" src="http://2.bp.blogspot.com/-Xf4capBxGdc/T_T3VMLDBnI/AAAAAAAAABo/ILQDpY-nUCk/s1600/Rlogo-small.png" /></a></div>
<div style="font: 12.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
Here's a function I have written for the <a href="http://www.r-project.org/">R statistical environment</a> that calculates a rarefaction curve giving expected Phylogenetic Diversity (PD) for multiple values of sampling effort. Sampling effort can be defined in terms of the number of individuals, sites or species. This function is a variant of “phylocurve” that uses randomisation rather than an exact analytical formula to calculate an estimate of expected PD. In addition, this version allows the calculation of estimates for the standard deviation, 95% confidence limits and minimum and maximum values of PD. The accuracy of these estimates is of course dependent on the number of permutations. I am providing it for free and without warranty under the <a href="http://www.gnu.org/licenses/">GNU General Public License</a>. You need to be familiar with R to use this function. The function also requires that the <a href="http://cran.r-project.org/web/packages/ape/index.html">ape package</a> be installed. To load the function, place the file in your working folder and type <span style="font: normal normal normal 11px/normal Courier;">source("phylocurve_perm.R")</span>.</div>
<div style="font: 12.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px; min-height: 14.0px;">
<br /></div>
<div style="font: 12.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
WARNING: this function is <i>much</i> slower (about 10x to 100x) than “phylocurve”. For large datasets, I recommend testing with relatively large values of stepm and small values of perm to gauge the amount of computer time required.</div>
<div style="font: 12.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px; min-height: 14.0px;">
<br /></div>
<div style="font: 12.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
<i>Latest version:</i> 17th August 2011. This version fixes a bug where rarefaction values were calculated incorrectly with community data tables containing species with total abundances of 0.</div>
<div style="font: 12.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px; min-height: 14.0px;">
<br /></div>
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
<b>Usage</b></div>
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
phylocurve.perm(x, phy, stepm = 1, subsampling = "individual", replacement = FALSE, perm = 1000)</div>
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
<b>Arguments</b></div>
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
x is a community data table (as in the vegan package) with species/OTUs as columns and samples/sites as rows. Columns are labelled with the names of the species/OTUs. Rows are labelled with the names of the samples/sites. Data can be either abundance or incidence (0/1).</div>
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
phy is a rooted phylogenetic tree with branch lengths stored as a phylo object (as in the ape package) with terminal nodes labelled with names matching those of the community data table. Note that the function trims away any terminal taxa not present in the community data table, so it is not necessary to do this beforehand.</div>
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
<span style="font: normal normal normal 11px/normal Courier;">stepm</span> is the size of the interval in a sequence of numbers of individuals, sites or species to which x is to be rarefied.</div>
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
<span style="font: normal normal normal 11px/normal Courier;">subsampling</span> indicates whether the subsampling will be by “individual” (default), “site” or “species”. When there are multiple sites, rarefaction by individuals or species is done by first pooling the sites.</div>
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
<span style="font: normal normal normal 11px/normal Courier;">replacement</span> is a logical indicating whether subsampling should be done with or without replacement.</div>
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
<span style="font: normal normal normal 11px/normal Courier;">perm</span> is the number of iterations of the subsampling routine used to calculate expected PD.</div>
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px; min-height: 13.0px;">
<br /></div>
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
<b>Details</b></div>
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
<span style="font: normal normal normal 11px/normal Courier;">phylocurve.perm</span> takes a community data table and a rooted phylogenetic tree (with branch lengths) and calculates expected Phylogenetic Diversity (PD) for every specified value of <span style="font: normal normal normal 11px/normal Courier;">m</span> individuals, sites or species. <span style="font: normal normal normal 11px/normal Courier;">m</span> will range from 1 to the total number of individuals/sites/species in increments given by <span style="font: normal normal normal 11px/normal Courier;">stepm</span>. Estimates of expected PD as well as variance about that mean are determined by Monte Carlo randomisations (see Gotelli and Colwell 2001 for a more detailed explanation of the procedure as applied to species richness). When there are multiple sites in the community table and rarefaction is by individuals or species, sites are first pooled.</div>
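<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
The randomisation approach can be sketched in base R as below. This is a simplified illustration, not the code of phylocurve.perm: the toy tree, the pooled abundances and the helper names <span style="font: normal normal normal 11px/normal Courier;">pd_of</span> and <span style="font: normal normal normal 11px/normal Courier;">perm_pd</span> are assumptions. Individuals are repeatedly drawn at random, PD is computed for each draw, and the draws are then summarised.</div>

```r
set.seed(1)  # make the random draws repeatable

# Toy rooted tree ((A:1,B:1):1,C:2); one entry per branch.
branch_len <- c(A = 1, B = 1, AB = 1, C = 2)

# Which tips (columns) descend from which branch (rows).
branch_tips <- rbind(A  = c(1, 0, 0),
                     B  = c(0, 1, 0),
                     AB = c(1, 1, 0),
                     C  = c(0, 0, 1))
colnames(branch_tips) <- c("A", "B", "C")

# Hypothetical pooled abundances, expanded to one entry per individual.
abund <- c(A = 2, B = 1, C = 3)
individuals <- rep(names(abund), times = abund)

# PD (including the path to the root) of a set of species.
pd_of <- function(species) {
  present <- rowSums(branch_tips[, species, drop = FALSE]) > 0
  sum(branch_len[present])
}

# Summary of PD over perm random subsamples of m individuals.
perm_pd <- function(m, perm = 1000, replacement = FALSE) {
  draws <- replicate(perm,
                     pd_of(unique(sample(individuals, m, replace = replacement))))
  c(m = m, mean = mean(draws), sd = sd(draws),
    lower = unname(quantile(draws, 0.025)),
    upper = unname(quantile(draws, 0.975)),
    min = min(draws), max = max(draws))
}

# One row per value of m, from 1 to the total number of individuals.
result <- t(vapply(seq_along(individuals), perm_pd, numeric(7)))
print(result)
```

<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
For this toy example the exact expected PD at m = 3 is 4.15, so the randomised mean should land close to that value, with the gap shrinking as perm increases.</div>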
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
<b>Value</b></div>
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
<span style="font: normal normal normal 11px/normal Courier;">phylocurve.perm</span> returns a matrix object of seven columns with each row corresponding to a particular value of <span style="font: normal normal normal 11px/normal Courier;">m</span>. The columns are: 1) the values of <span style="font: normal normal normal 11px/normal Courier;">m</span>; 2) the mean (expected) values of PD for all randomisations; 3) the standard deviation of PD for all randomisations; 4) the 0.025 quantile (lower 95% confidence limit); 5) the 0.975 quantile (upper 95% confidence limit); 6) the minimum PD; and 7) the maximum PD.</div>
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
<b>References</b></div>
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
Gotelli and Colwell. 2001. Quantifying biodiversity: procedures and pitfalls in the measurement and comparison of species richness. <i>Ecology Letters</i> 4: 379–391.</div>
<div style="font: 12.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
<a href="https://dl.dropbox.com/u/89646921/R%20Functions/phylocurve_perm.R?dl=1">>> Download phylocurve_perm.R</a></div>David Nipperesshttp://www.blogger.com/profile/07546966746014882136noreply@blogger.com0tag:blogger.com,1999:blog-8484027044998762461.post-56878775880544044282012-07-05T14:31:00.002+10:002012-07-05T14:44:47.693+10:00pd.resemble: an R function for calculating the pairwise resemblance in phylogenetic diversity of ecological samples<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="http://2.bp.blogspot.com/-Xf4capBxGdc/T_T3VMLDBnI/AAAAAAAAABo/ILQDpY-nUCk/s1600/Rlogo-small.png" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" src="http://2.bp.blogspot.com/-Xf4capBxGdc/T_T3VMLDBnI/AAAAAAAAABo/ILQDpY-nUCk/s1600/Rlogo-small.png" /></a></div>
<div style="font: 12.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
Here's a function I have written for the <a href="http://www.r-project.org/">R statistical environment</a> that calculates the pairwise resemblance (i.e. either similarity or dissimilarity) in Phylogenetic Diversity of multiple samples. I am providing it for free and without warranty under the <a href="http://www.gnu.org/licenses/">GNU General Public License</a>. You need to be familiar with R to use this function. The function also requires that the <a href="http://cran.r-project.org/web/packages/ape/index.html">ape package</a> be installed. To load the function, place the file in your working folder and type <span style="font: normal normal normal 11px/normal Courier;">source("pdresemble.R")</span>.</div>
<div style="font: 12.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px; min-height: 14.0px;">
<br /></div>
<div style="font: 12.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
<i>Latest version:</i> 7th March 2011.</div>
<div style="font: 12.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px; min-height: 14.0px;">
<br /></div>
<div style="font: 12.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
<i>Please note that this function was previously called “phylosim”. However, an R package now has that name, so I changed the name of my function. This version of the function also allows for the calculation of either similarity or dissimilarity while previous versions only calculated similarity, although conversion between the two is trivial.</i></div>
<div style="font: 12.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px; min-height: 14.0px;">
<br /></div>
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
<b>Usage</b></div>
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
pd.resemble(x, phy, incidence = TRUE, method = "sorensen", dissim = TRUE)</div>
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
<b>Arguments</b></div>
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
x is a community data table (as in the vegan package) with species/OTUs as columns and samples/sites as rows. Columns are labelled with the names of the species/OTUs. Rows are labelled with the names of the samples/sites. Data can be either abundance or incidence (0/1).</div>
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
phy is a rooted phylogenetic tree with branch lengths stored as a phylo object (as in the ape package) with terminal nodes labelled with names matching those of the community data table. Note that the function trims away any terminal taxa not present in the community data table, so it is not necessary to do this beforehand.</div>
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
incidence is a logical indicating whether the data are to be treated as incidence (binary presence-absence) or abundance.</div>
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
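method indicates the particular form of the resemblance index you wish to use. Current options are: "sorensen" (default: 2a/(2a+b+c)), "jaccard" (a/(a+b+c)), "simpson" (a/(a+min{b,c})) and "faith" ((a+0.5d)/(a+b+c+d)). Here a is the component shared by both samples, b and c are the components unique to each sample, and d is the component absent from both.</div>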
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
<span style="font: normal normal normal 11px/normal Courier;">dissim</span> is a logical indicating whether the pairwise resemblance scores should be similarity or dissimilarity (default).</div>
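<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
To make the arithmetic behind the method options concrete, here is a minimal sketch in base R. The helper name resemblance_index is hypothetical and is not part of pdresemble.R; it also assumes that dissimilarity is simply 1 minus similarity, which may not be exactly how pd.resemble does the conversion.</div>

```r
# Hypothetical helper (not from pdresemble.R): computes a resemblance index
# from the four components a (shared), b and c (unique to each sample)
# and d (absent from both samples).
resemblance_index <- function(a, b, c, d = 0,
                              method = "sorensen", dissim = TRUE) {
  sim <- switch(method,
    sorensen = 2 * a / (2 * a + b + c),
    jaccard  = a / (a + b + c),
    simpson  = a / (a + min(b, c)),
    faith    = (a + 0.5 * d) / (a + b + c + d),
    stop("unknown method: ", method)
  )
  # Assumption: dissimilarity is the complement of similarity
  if (dissim) 1 - sim else sim
}

# Example with made-up components: 6 units shared, 2 and 4 units unique
jaccard_sim  <- resemblance_index(6, 2, 4, method = "jaccard", dissim = FALSE)
sorensen_dis <- resemblance_index(6, 2, 4, method = "sorensen", dissim = TRUE)
```

<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
With these made-up values the Jaccard similarity is 6/12 and the Sorensen dissimilarity is 6/18.</div>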
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
<b>Details</b></div>
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
pd.resemble takes a community data table and a rooted phylogenetic tree (with branch lengths) and calculates the resemblance in Phylogenetic Diversity (PD-resemblance) of all pairwise combinations of samples/sites. The principles for calculating PD-resemblance on incidence data are discussed by Ferrier et al. (2007). I have extended this approach to include abundance data (Nipperess et al. 2010).</div>
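<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
As a rough illustration of the idea for incidence data, the sketch below partitions branch lengths into shared and unique components for one pair of samples. It is not the code in pdresemble.R: the branch-by-sample matrix, the branch lengths and the helper name pd_components are all made up for the example, and in practice the presence of each branch in each sample would be derived from the tree and the community table.</div>

```r
# Assumed toy inputs: four branches with their lengths, and a logical matrix
# saying whether each branch is represented in each sample.
branch_len <- c(1.0, 2.0, 0.5, 1.5)
present <- cbind(
  site1 = c(TRUE, TRUE,  FALSE, FALSE),
  site2 = c(TRUE, FALSE, TRUE,  FALSE)
)

# Hypothetical helper: sums branch length shared by samples i and j (a),
# unique to i (b), unique to j (c) and absent from both (d).
pd_components <- function(present, branch_len, i, j) {
  in_i <- present[, i]
  in_j <- present[, j]
  c(a = sum(branch_len[in_i & in_j]),
    b = sum(branch_len[in_i & !in_j]),
    c = sum(branch_len[!in_i & in_j]),
    d = sum(branch_len[!in_i & !in_j]))
}

comp <- pd_components(present, branch_len, "site1", "site2")

# Sorensen-type PD dissimilarity: (b + c) / (2a + b + c)
pd_sorensen <- unname((comp["b"] + comp["c"]) /
                      (2 * comp["a"] + comp["b"] + comp["c"]))
```

<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
Here a = 1, b = 2, c = 0.5 and d = 1.5, so the Sorensen-type dissimilarity is 2.5/4.5.</div>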
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
<b>Value</b></div>
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
pd.resemble returns a dist object giving the PD-resemblance of all pairwise combinations of samples/sites in x.</div>
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
<b>References</b></div>
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
Ferrier S, Manion G, Elith J & Richardson K. 2007. Using generalized dissimilarity modelling to analyse and predict patterns of beta diversity in regional biodiversity assessment. <i>Diversity & Distributions</i> 13: 252-264.</div>
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
Nipperess DA, Faith DP & Barton K. 2010. Resemblance in phylogenetic diversity among ecological assemblages. <i>Journal of Vegetation Science</i> 21: 809-820.</div>
<div style="font: 12.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
<a href="https://dl.dropbox.com/u/89646921/R%20Functions/pdresemble.R?dl=1">>> Download pdresemble.R</a></div>David Nipperesshttp://www.blogger.com/profile/07546966746014882136noreply@blogger.com0tag:blogger.com,1999:blog-8484027044998762461.post-11422955177606931712012-07-05T14:22:00.003+10:002012-07-05T14:43:56.807+10:00phylo.endemism: an R function for calculating phylogenetic endemism of ecological samples<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="http://2.bp.blogspot.com/-Xf4capBxGdc/T_T3VMLDBnI/AAAAAAAAABo/ILQDpY-nUCk/s1600/Rlogo-small.png" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" src="http://2.bp.blogspot.com/-Xf4capBxGdc/T_T3VMLDBnI/AAAAAAAAABo/ILQDpY-nUCk/s1600/Rlogo-small.png" /></a></div>
<div style="font: 12.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
Here's a function I have written for the <a href="http://www.r-project.org/">R statistical environment</a> that calculates phylogenetic endemism of multiple samples. I am providing it for free and without warranty under the <a href="http://www.gnu.org/licenses/">GNU General Public License</a>. You need to be familiar with R to use this function. The function also requires that the <a href="http://cran.r-project.org/web/packages/ape/index.html">ape package</a> be installed. To load the function, place the file in your working folder and type ‘source("phyloendemism.R")’.</div>
<div style="font: 12.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px; min-height: 14.0px;">
<br /></div>
<div style="font: 12.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
<i>Latest version:</i> 2nd December 2010.</div>
<div style="font: 12.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px; min-height: 14.0px;">
<br /></div>
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
<b>Usage</b></div>
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
phylo.endemism (x, phy, weighted = T)</div>
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
<b>Arguments</b></div>
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
x is a community data table (as in the vegan package) with species/OTUs as columns and samples/sites as rows. Columns are labelled with the names of the species/OTUs. Rows are labelled with the names of the samples/sites. Data can be either abundance or incidence (0/1).</div>
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
phy is a rooted phylogenetic tree with branch lengths stored as a phylo object (as in the ape package) with terminal nodes labelled with names matching those of the community data table. Note that the function trims away any terminal taxa not present in the community data table, so it is not necessary to do this beforehand.</div>
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
weighted is a logical indicating whether weighted endemism (default) or strict endemism should be calculated.</div>
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
<b>Details</b></div>
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
phylo.endemism takes a community data table and a rooted phylogenetic tree (with branch lengths) and calculates either strict or weighted endemism in Phylogenetic Diversity (PD). Strict endemism equates to the total amount of branch length found only in the sample/site and is described by Faith et al. (2004) as <i>PD-endemism</i>. Weighted endemism calculates the "spatial uniqueness" of each branch in the tree by taking the inverse of its range, multiplying by branch length and summing for all branch lengths present at a sample/site. Range is calculated simply as the total number of samples/sites at which the branch is present. This latter approach is described by Rosauer et al. (2009) as <i>Phylogenetic endemism</i>.</div>
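<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
The two calculations described above can be sketched in a few lines of base R. This is only an illustration under stated assumptions, not the code in phyloendemism.R: the site-by-branch incidence matrix and branch lengths are invented, and every branch is assumed to occur in at least one site (otherwise the range would be zero).</div>

```r
# Assumed toy inputs: four branches and their occurrence at three sites
# (rows are sites, columns are branches, 1 = branch present at the site).
branch_len <- c(1.0, 2.0, 0.5, 1.5)
present <- rbind(
  siteA = c(1, 1, 0, 0),
  siteB = c(1, 0, 1, 0),
  siteC = c(1, 0, 1, 1)
)

# Range of each branch: the number of sites at which it is present
range_size <- colSums(present)

# Weighted endemism: branch length divided by range, summed over the
# branches present at each site
weighted_pe <- as.vector(present %*% (branch_len / range_size))
names(weighted_pe) <- rownames(present)

# Strict endemism: total length of branches found at that site and no other
strict_pe <- as.vector(present %*% (branch_len * (range_size == 1)))
names(strict_pe) <- rownames(present)
```

<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
In this toy case siteB holds no branch that is unique to it, so its strict endemism is zero while its weighted endemism is still positive.</div>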
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
<b>Value</b></div>
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
phylo.endemism returns a vector giving the phylogenetic endemism of all samples/sites in x.</div>
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
<b>References</b></div>
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
Faith DP, Reid CAM & Hunter J. 2004. Integrating phylogenetic diversity, complementarity, and endemism for conservation assessment. <i>Conservation Biology</i> 18(1): 255-261.</div>
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
Rosauer D, Laffan SW, Crisp MD, Donnellan SC & Cook LG. 2009. Phylogenetic endemism: a new approach for identifying geographical concentrations of evolutionary history. <i>Molecular Ecology</i> 18(19): 4061-4072.</div>
<div style="font: 12.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
<a href="https://dl.dropbox.com/u/89646921/R%20Functions/phyloendemism.R?dl=1">>> Download phyloendemism.R</a></div>David Nipperesshttp://www.blogger.com/profile/07546966746014882136noreply@blogger.com0tag:blogger.com,1999:blog-8484027044998762461.post-13373976630640023222012-07-05T12:14:00.000+10:002012-07-05T12:22:46.105+10:00addbinary: an R function for converting an abundance table (x) into its additive binary form<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="http://2.bp.blogspot.com/-Xf4capBxGdc/T_T3VMLDBnI/AAAAAAAAABo/ILQDpY-nUCk/s1600/Rlogo-small.png" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" src="http://2.bp.blogspot.com/-Xf4capBxGdc/T_T3VMLDBnI/AAAAAAAAABo/ILQDpY-nUCk/s1600/Rlogo-small.png" /></a></div>
<div style="font: 12.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px; min-height: 14.0px;">
Here's a function I have written for the <a href="http://www.r-project.org/">R statistical environment</a> that converts a table of species abundances into its additive binary form. That is, each column of abundance data (representing a single species across multiple sites) is converted into a number of columns equal to the maximum abundance of that species, with each new column representing an abundance value ranging from 1 to maximum abundance. Species abundance per site is recorded as a "1" in each column for which the abundance value is equal to or less than the site abundance.</div>
<div style="font: 12.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
<br /></div>
<div style="font: 12.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
If that makes no sense, let me attempt a simple example below with an abundance vector for a single species:</div>
<div style="font: 12.0px Courier; margin: 0.0px 0.0px 0.0px 0.0px; min-height: 14.0px;">
<br /></div>
<div style="font: 12.0px Courier; margin: 0.0px 0.0px 0.0px 0.0px;">
0 --> 000</div>
<div style="font: 12.0px Courier; margin: 0.0px 0.0px 0.0px 0.0px;">
3 --> 111</div>
<div style="font: 12.0px Courier; margin: 0.0px 0.0px 0.0px 0.0px;">
1 --> 100</div>
<div style="font: 12.0px Courier; margin: 0.0px 0.0px 0.0px 0.0px;">
0 --> 000</div>
<div style="font: 12.0px Courier; margin: 0.0px 0.0px 0.0px 0.0px;">
2 --> 110</div>
<div style="font: 12.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px; min-height: 14.0px;">
<br /></div>
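<div style="font: 12.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
The same conversion can be sketched in base R as follows. This is a minimal illustration rather than the code in addbinary.R; the function name add_binary_sketch and the column naming scheme are my own assumptions, and abundances are assumed to be non-negative whole numbers.</div>

```r
# Hypothetical sketch of the additive binary conversion (not addbinary.R).
# Each species column becomes max-abundance columns; a site scores 1 in
# column k when its abundance is at least k.
add_binary_sketch <- function(x) {
  x <- as.matrix(x)
  cols <- lapply(seq_len(ncol(x)), function(j) {
    max_ab <- max(x[, j])
    if (max_ab < 1) return(NULL)  # species absent everywhere: no columns
    out <- outer(x[, j], seq_len(max_ab), ">=") * 1L
    colnames(out) <- paste(colnames(x)[j], seq_len(max_ab), sep = "_")
    out
  })
  do.call(cbind, cols)
}

# The single-species example from the text: abundances 0, 3, 1, 0, 2
x <- matrix(c(0, 3, 1, 0, 2), ncol = 1,
            dimnames = list(paste0("site", 1:5), "sp1"))
ab <- add_binary_sketch(x)
```

<div style="font: 12.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
The rows of ab reproduce the five patterns listed above (000, 111, 100, 000, 110).</div>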
<div style="font: 12.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
The purpose of doing this, at least from my point of view, is to allow for the calculation of abundance-based forms of resemblance (similarity/dissimilarity) measures with R functions that only calculate the incidence-based, or binary, form. Tamas et al. (2001) have shown that any incidence-based resemblance measure calculated from the 2 x 2 contingency table can be transformed into its abundance-based equivalent by this method. So, for example, functions available in the simba and GDM packages can be "fooled" into accepting abundance-weighted data for calculating resemblance measures. Hopefully, this will be useful to somebody other than me.</div>
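<div style="font: 12.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
As a small check of this idea, the sketch below computes the incidence-based Sorensen dissimilarity on the additive binary form of two sites and compares it with the Bray-Curtis dissimilarity computed directly from the abundances. The data and the helper to_additive_binary are invented for the example and do not come from addbinary.R.</div>

```r
# Made-up abundances of three species at two sites
x <- rbind(site1 = c(sp1 = 3, sp2 = 0, sp3 = 2),
           site2 = c(sp1 = 1, sp2 = 4, sp3 = 2))

# Hypothetical helper: additive binary form of one site's abundance vector,
# given the maximum abundance of each species across sites
to_additive_binary <- function(v, max_ab) {
  unlist(lapply(seq_along(v), function(j) {
    as.integer(seq_len(max_ab[j]) <= v[j])
  }))
}

max_ab <- apply(x, 2, max)
bin1 <- to_additive_binary(x["site1", ], max_ab)
bin2 <- to_additive_binary(x["site2", ], max_ab)

# 2 x 2 contingency components from the binary vectors
a  <- sum(bin1 == 1 & bin2 == 1)
b  <- sum(bin1 == 1 & bin2 == 0)
cc <- sum(bin1 == 0 & bin2 == 1)

# Incidence-based Sorensen dissimilarity on the additive binary form
sorensen_bin <- (b + cc) / (2 * a + b + cc)

# Bray-Curtis dissimilarity computed directly from the abundances
bray_curtis <- sum(abs(x[1, ] - x[2, ])) / sum(x[1, ] + x[2, ])
```

<div style="font: 12.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
For these data a = 3, b = 2 and c = 4, and both calculations give the same value.</div>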
<div style="font: 12.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px; min-height: 14.0px;">
<br /></div>
<div style="font: 12.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
I am providing this function for free and without warranty under the <a href="http://www.gnu.org/licenses/">GNU General Public License</a>. You need to be familiar with R to use this function.</div>
<div style="font: 12.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px; min-height: 14.0px;">
<br /></div>
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
<b>Usage</b></div>
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
addbinary (x)</div>
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
<b>Arguments</b></div>
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
x is a community data table (as in the vegan package) with species/OTUs as columns and samples/sites as rows. Columns are labelled with the names of the species/OTUs. Rows are labelled with the names of the samples/sites. Data are abundances of species/OTUs per sample/site.</div>
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
<b>Details</b></div>
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
addbinary converts a community data table to its additive binary form (see discussion above).</div>
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
<b>Value</b></div>
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
addbinary returns a matrix containing the additive binary form of x.</div>
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
<b>References</b></div>
<div style="font: 11.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
Tamas et al. 2001. An extension of presence/absence coefficients to abundance data: a new look at absence. <i>Journal of Vegetation Science</i> 12: 401-410.</div>
<div style="font: 12.0px Helvetica; margin: 0.0px 0.0px 0.0px 0.0px;">
<a href="https://dl.dropbox.com/u/89646921/R%20Functions/addbinary.R?dl=1">>> Download addbinary.R</a></div>David Nipperesshttp://www.blogger.com/profile/07546966746014882136noreply@blogger.com0tag:blogger.com,1999:blog-8484027044998762461.post-48914280563960722602012-06-14T11:44:00.001+10:002012-06-14T12:19:59.951+10:00Welcome to my new websiteThis website will replace my old .mac site which will be shut down at the end of the month. The goal of this new website is to generally promote my research and teaching activities and provide a place where you can download my software.David Nipperesshttp://www.blogger.com/profile/07546966746014882136noreply@blogger.com