Testing for an evolutionary correlation between binary characters using Pagel's (1994) method


This final exercise is a short exercise designed to fit and test Pagel's (1994) method for testing for an evolutionary relationship between two binary characters.

To do this exercise, you will first need to download the following dataset & tree, although they can also be loaded directly from the web.

The dataset is here and the corresponding phylogenetic tree can be downloaded here.

Let's load phytools and then our data and tree:

library(phytools)
tree<-read.tree("http://www.phytools.org/Ilhabela2015/data/fitPagel-tree.tre")
tree
## 
## Phylogenetic tree with 300 tips and 299 internal nodes.
## 
## Tip labels:
##  t149, t150, t164, t165, t59, t60, ...
## 
## Rooted; includes branch lengths.
X<-read.csv("http://www.phytools.org/Ilhabela2015/data/fitPagel-data.csv",
    row.names=1,header=TRUE)
## X is a big matrix, so let's just visualize the top part
dim(X)
## [1] 300   3
head(X)
##      x y z
## t149 a a b
## t150 a b b
## t164 b a b
## t165 b b b
## t59  a a b
## t60  a a b

OK, having done this, we will use the phytools function fitPagel to fit Pagel's (1994) model. To get a sense of our data before we do, we should first visualize the data on the tree. Let's try this:

par(mfrow=c(1,2))
plot(tree,show.tip.label=FALSE,no.margin=TRUE)
par(fg="transparent")
x<-setNames(X[[1]],rownames(X))
tiplabels(pie=to.matrix(x,c("a","b")),piecol=c("blue","red"),
    cex=0.3)
plot(tree,show.tip.label=FALSE,no.margin=TRUE,direction="leftwards")
y<-setNames(X[[2]],rownames(X))
tiplabels(pie=to.matrix(x,c("a","b")),piecol=c("blue","red"),
    cex=0.3)

plot of chunk unnamed-chunk-2

par(fg="black")

OK, so we can repeat that also with the third column of X.

Now we're ready to run our model:

fit.xy<-fitPagel(tree,x,y)
fit.xy
## 
##   Pagel's binary character correlation test:
## 
## Indepedent model rate matrix:
##            a|a        a|b        b|a        b|b
## a|a -1.6338807  0.7929193  0.8409613  0.0000000
## a|b  0.4595753 -1.3005366  0.0000000  0.8409613
## b|a  0.5940199  0.0000000 -1.3869392  0.7929193
## b|b  0.0000000  0.5940199  0.4595753 -1.0535951
## 
## Dependent model rate matrix:
##            a|a        a|b        b|a        b|b
## a|a -1.0668090  0.6538049  0.4130040  0.0000000
## a|b  0.9094257 -4.6819222  0.0000000  3.7724965
## b|a  2.6064787  0.0000000 -4.2118579  1.6053792
## b|b  0.0000000  0.3624456  0.3440528 -0.7064984
## 
## Model fit:
##             log-likelihood
## independent      -231.6494
## dependent        -205.2088
## 
## Hypothesis test result:
##   likelihood-ratio:  52.88121 
##   p-value:  9.023704e-11
z<-setNames(X[[3]],rownames(X))
fit.xz<-fitPagel(tree,x,z)

Written by Liam J. Revell. Last updated 4 July 2015.