This lab will introduce you to modeling presence/absence data with GLMs. This is also the first full-modeling lab. A key element of this lab is comparing the observed response in the data with the predicted output of the model, and examining how both relate to the predictor variables.
Note: Remember to load the glm2 library, with library(glm2), before running the modeling code in this lab.
The function below will create synthetic presence/absence data for evaluating GLMs. Copy it into R now and run it so that the function is defined.
############################################################################
# Creates a data frame with Xs ranging from 1 to the number of entries
# and "Measures" with about 1/2 set to 0 and about 1/2 set to 1. Before
# randomness is added, the lower half of the X values have measures of 0.
# ProportionRandom - amount of uniform randomness to add
############################################################################
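Categories1D_Random=function(NumEntries=10,ProportionRandom=0.4)
{
  # Half-width of the uniform noise added to the threshold
  Range=NumEntries*ProportionRandom/2
  # Allocate the output vectors
  Ys=as.vector(array(1:NumEntries))
  Xs=as.vector(array(1:NumEntries))
  Measures=as.vector(array(1:NumEntries))
  for (Index in 1:NumEntries)
  {
    Xs[Index]=Index #runif(1,0,100)
    Ys[Index]=Index #runif(1,0,100)
    # Shift the threshold (the midpoint of the Xs) by a uniform random amount
    Random=0
    if (ProportionRandom!=0) Random=runif(1,-Range,Range)
    # The measure is 1 above the shifted threshold and 0 otherwise
    if (Xs[Index]>NumEntries/2+Random) Measures[Index]=1
    else Measures[Index]=0
  }
  TheDataFrame = data.frame(Ys, Xs, Measures)
  # Return the data frame explicitly so it also prints when called at the prompt
  return(TheDataFrame)
}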
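The loop in the function above can also be written without a loop. The block below is a minimal sketch of a vectorized variant, not part of the original lab: the name Categories1D_Vectorized is hypothetical, and the sketch assumes the same threshold logic as the function above. It also sets a random seed so that the result is repeatable, and checks that the randomness only affects the middle of the X range.

```r
# Hypothetical vectorized variant of Categories1D_Random (an illustration only)
Categories1D_Vectorized=function(NumEntries=10,ProportionRandom=0.4)
{
  # Half-width of the uniform noise added to the threshold
  Range=NumEntries*ProportionRandom/2
  Xs=1:NumEntries
  Ys=1:NumEntries
  # One random shift per entry, or all zeros when no randomness is requested
  if (ProportionRandom!=0) Random=runif(NumEntries,-Range,Range)
  else Random=rep(0,NumEntries)
  # TRUE/FALSE converted to 1/0: 1 above the shifted threshold, 0 otherwise
  Measures=as.numeric(Xs>NumEntries/2+Random)
  data.frame(Ys,Xs,Measures)
}

set.seed(1) # fix the random seed so the same data is produced each run
TheData=Categories1D_Vectorized(100)
str(TheData) # show the columns and their types
print(table(TheData$Measures)) # count the 0s and 1s

# With the defaults the noise spans +/-20 around X=50, so only the
# X values between 30 and 70 can fall on either side of the threshold
print(all(TheData$Measures[TheData$Xs<=30]==0))
print(all(TheData$Measures[TheData$Xs>=70]==1))
```

Setting ProportionRandom to 0 gives a clean step at the midpoint, while larger values widen the band of X values in which 0s and 1s are mixed.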
The code below will create a synthetic data set for a logistic model using the function above. Try it now in R. Note that the data contains some randomness to make the values overlap a bit.
TheData=Categories1D_Random(100) # create a set of binary data
plot(TheData$Xs,TheData$Measures) # plot the data
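To connect this data to the goal of the lab, comparing the response with the model's predicted output, the block below is a minimal sketch of fitting a logistic GLM to data of this shape. It makes several assumptions: it uses base R's glm() with family=binomial so that it runs without extra packages (glm2() from the glm2 library accepts the same formula, data and family arguments), it rebuilds a small data set inline with the same default noise range so that the block runs on its own, and the names TheModel and ThePrediction are hypothetical.

```r
set.seed(42) # fix the random seed so the same data is produced each run

# Rebuild data of the same shape inline (assumed equivalent to
# Categories1D_Random(100) with its default ProportionRandom of 0.4)
NumEntries=100
Xs=1:NumEntries
Measures=as.numeric(Xs>NumEntries/2+runif(NumEntries,-20,20))
TheData=data.frame(Xs,Measures)

# Fit a logistic model: the probability of a 1 as a function of Xs
TheModel=glm(Measures~Xs,data=TheData,family=binomial)
print(summary(TheModel))

# Predicted probability of a 1 at each X value (on the 0 to 1 scale)
ThePrediction=predict(TheModel,type="response")

# Plot the response (points) with the model's predicted output (line)
plot(TheData$Xs,TheData$Measures)
lines(TheData$Xs,ThePrediction)
```

If you have already created TheData with Categories1D_Random(100), you can skip the inline data lines and fit the model to that data frame directly, since it uses the same Xs and Measures column names.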
© Copyright 2018 HSU - All rights reserved.