Maximum likelihood

build a likelihood function from first principles
use the function to plot and explore a likelihood surface
maximize likelihood
model selection

Scenario: you are studying a plant. You measure the average length of a leaf in your plant before and after a hurricane.

Model: Assume that the difference in trait values between the two measurements follows a normal distribution with mean zero and some unknown variance \(\sigma ^2\).

Measurement before = 5.3 mm, after is 7.4 mm.

creating a likelihood function

We want P(D|H). We have data, the trait difference 7.4 - 5.3 = 2.1 mm. Let’s use the normal distribution function to calculate a likelihood.

# denote our unknown parameter sigma^2 as x
nlik <- function(x, data=2.1) {
  1/sqrt(2 * pi * x) * exp(-data^2/(2*x))
}

nlik(1)

## [1] 0.0439836

nlik(2000)

## [1] 0.008910791

dnorm(2.1, mean=0, sd=1)

## [1] 0.0439836

Maximize the likelihood - brute force way

Run the function across a range of values and find the maximum.

testVals <- seq(0.5, 10, by=0.1)
testLikelihoods <- numeric(length(testVals))

for(i in 1:length(testVals)) {
  testLikelihoods[i]<-nlik(testVals[i])
}

plot(testVals, testLikelihoods, type="l")

testVals <- seq(4, 5, by=0.01)
testLikelihoods <- numeric(length(testVals))

for(i in 1:length(testVals)) {
  testLikelihoods[i]<-nlik(testVals[i])
}

plot(testVals, testLikelihoods, type="l")

# what is my maximm?u
index<-which(testLikelihoods==max(testLikelihoods))
testLikelihoods[index] # this is the maximum of the likelihood

## [1] 0.1152242

testVals[index] # this is my MLE

## [1] 4.41

use an algorithm to find the peak on the likelihood surface

my_guess <- 1.0
eps <- 0.1

for(i in 1:1000) {
  leftGuess <- my_guess - runif(1, min=0, max=eps)
  rightGuess <- my_guess + runif(1, min=0, max=eps)
  
  leftL <- nlik(leftGuess)
  currentL <- nlik(my_guess)
  rightL <- nlik(rightGuess)
  
  if(leftL > currentL) {
    my_guess <- leftGuess
  } else if(rightL > currentL) {
    my_guess <- rightGuess
  }

  if(i%%100 == 0) cat("Current guess: ", my_guess, "\n")  
}

## Current guess:  4.407499 
## Current guess:  4.409618 
## Current guess:  4.410108 
## Current guess:  4.410108 
## Current guess:  4.410108 
## Current guess:  4.409952 
## Current guess:  4.409952 
## Current guess:  4.409952 
## Current guess:  4.409952 
## Current guess:  4.409952

for(i in 1:1000) {
  leftGuess <- my_guess - runif(1, min=0, max=eps)
  rightGuess <- my_guess + runif(1, min=0, max=eps)
  
  leftL <- nlik(leftGuess)
  currentL <- nlik(my_guess)
  rightL <- nlik(rightGuess)
  
  if(leftL > currentL) {
    my_guess <- leftGuess
  } else if(rightL > currentL) {
    my_guess <- rightGuess
  }

  if(i%%100 == 0) cat("Current guess: ", my_guess, "\n")  
}

## Current guess:  4.409952 
## Current guess:  4.409952 
## Current guess:  4.409952 
## Current guess:  4.409952 
## Current guess:  4.409952 
## Current guess:  4.409952 
## Current guess:  4.409952 
## Current guess:  4.409952 
## Current guess:  4.409952 
## Current guess:  4.410043

Let R do all the work

In general R has a function called optim that climbs peaks and finds minimums - but we can make it work to maximize too.

In this case we have only one parameter so we can use optimize which is set up for this one-D case.

res <- optimize(nlik, interval=c(1, 5), maximum=T)
res

## $maximum
## [1] 4.409998
## 
## $objective
## [1] 0.1152242

# who won?
res$objective - nlik(my_guess)

## [1] 2.703116e-12

Now you do it.

Redo our experiment and get the following results.

Leaf size before hurricane = 7.3 mm Leaf size after hurricane = 7.1 mm

What is the ML estimate for sigma ^2 under my specified model?

# method 1
nlik <- function(x, data=0.2) {
  1/sqrt(2 * pi * x) * exp(-data^2/(2*x))
}
res <- optimize(nlik, interval=c(0.01, 5), maximum=T)
res

## $maximum
## [1] 0.03999487
## 
## $objective
## [1] 1.209854

# method 2
res <- optimize(nlik, interval=c(0.01, 5), maximum=T, data=0.2)
res

## $maximum
## [1] 0.03999487
## 
## $objective
## [1] 1.209854

# method 3
foo <- function(x) {dnorm(0.2, mean=0, sd=sqrt(x))}
res <- optimize(foo, interval=c(0.02, 5), maximum=T)

model selection

Alternative model: the trait changes due to genetic drift. Additional information: Ne = 1000, va = 10

Under drift, each generation we expect trait change to follow a normal distribution with variance va / Ne = 10 / 1000 = 0.01.

Under a drift hypothesis we have the same model but with \(\sigma^2 = 0.01\) as a fixed parameter.

This is a model with no free parameters.

Calculate the likelihood for this model when our amount of change is 0.2 mm.

# drift model
driftL <- nlik(0.1, data=0.2)

# alternative model
res <- optimize(nlik, interval=c(0.01, 5), maximum=T, data=0.2)
res

## $maximum
## [1] 0.03999487
## 
## $objective
## [1] 1.209854

# likelihood ratio test

delta <- 2*(log(res$objective) - log(driftL))
pchisq(delta, 1, lower.tail=F)

## [1] 0.5738454