Masthead

Creating Spatial Models with GAMs

This lab will introduce you to modeling continuous response variables with Generalized Additive Models or "GAMs". GAMs allow a level of flexibility that is not available with linear models and more control than polynomial models.

Refer to the "R for Spatial Stats" web site for information on running GAMs for this lab. You'll also want to review the sections on loops and sub-sampling data.

Note that you will need the mgcv library for this lab.

GAMs with Synthetic Data

Below is some code that will generate a polynomial up to the 4th order with optional noise injection.

#####################################################################
# Function to Create data from a polynomial up to order 4.
#
# Inputs:
#  NumEntries - The number of y-values to create
#  RandomStdDev - The standard of deviation for the random component
#  B0, B1, B2, B3, B4 - coefificents for the polynomial
#
# Outputs:
#  A DataFrame with a column for the Xs and another column for the "Measures" 
#
# By: Jim Graham
# February 23rd, 2014
#####################################################################
Poly2D=function(NumEntries=100,RandomStdDev=0,B0=0,B1=0,B2=0,B3=0,B4=0)
{
  Xs=as.vector(array(1:NumEntries)) # Create the independent variable
  Measures=as.vector(array(1:NumEntries)) # create the response variable
  
  for (Index in 1:NumEntries)
  {
    X=Xs[Index]
    Measures[Index]=B0+B1*X+B2*(X**2)+B3*(X**3)+B4*(X**4) # compute the response
    
    if (RandomStdDev>0) # if a random value was provided
    {
	  # add some raondomness
      Measures[Index]= Measures[Index]+rnorm(1, 0, sd = RandomStdDev);
    }
  }
 	# create the data frame
  TheDataFrame = data.frame(Xs, Measures)
  
  return(TheDataFrame)
} 

Use the code above to create a series of polynomials with different levels of noise as below. Then, create a GAM model as below and examine the model and the AIC values.

TheData=Poly2D(100,100,10,10,-.5,+0.004) # create some data
plot(TheData) # take a look at the data

TheModel=gam(Measures~s(Xs),data=TheData,gamma=1.4) # Recommended smoothing factor
plot(TheModel) # plot the model (how does it compare with the data?)

summary(TheModel)
AIC(TheModel)

Now, change the polynomial coeficients and the amount of randomness to see how well a GAM can model the effect in the data.

Your results will appear something like the following:

Family: gaussian   
Link function: identity     

Formula:  
Measures ~ s(Xs)    

Parametric coefficients: 
             Estimate Std. Error t value Pr(>|t|)
(Intercept) -157.453      1.017  -154.8   <2e-16 ***  
---  
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1    

Approximate significance of smooth terms:
          edf Ref.df    F p-value      
s(Xs) 8.724  8.978 2635  <2e-16 ***  
---  
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

R-sq.(adj) =  0.996   Deviance explained = 99.6%  
GCV = 125.22  Scale est. = 103.51    n = 100

The coefficnets are similar to a linear model as the spline is used to linearize the function.

Deviance explained is the proportion of the deviance for this model divided by the total possible deviance where deviance is the difference between the likelhood of a model that fits the data perfectly minus the liklihood of this model. See Generalized Additive Models: an introduction with R by Wood for more information.

What we are really interested in is AIC which is obtained with the AIC(TheModel) function call.

GAMs with Dour-Fir Data

Here is a zip file that contains the Doug-Fir data and R scripts with GAMs for one and two covariate models and associated graphs.

GAMs with Your Data

Using a data set with a continuous response varible (the covariates can be continuous or categorical), create a number of GAM models with different predictor layers and determine the "best" model. Try adjusting the "gamma" value to set the smoothing to a level you are comfortable with.

 

© Copyright 2018 HSU - All rights reserved.