#################
# Example using the goodness-of-fit test of Bondell,
# "Testing goodness-of-fit in logistic case-control studies",
# Biometrika, 2007.
# The R functions needed are in the file DistFit.R, which must be loaded (e.g. with source()) to use this example.
#################
#################
# This example uses the kyphosis data.
# The data are in the rpart package.
# Loading the package is needed only to get these data; it is not needed to run the procedure.
#################
library(rpart)
source("DistFit.R")  # assumes DistFit.R is in the working directory; adjust the path if not
y <- as.numeric(kyphosis$Kyphosis == "present")
x <- cbind(kyphosis$Age, kyphosis$Number, kyphosis$Start)
#####################
# The call to the function is given below.
# INPUT:
# x - a matrix (the design matrix) not including a column for the intercept. The intercept will automatically be included.
# y - vector of responses (must be zeros and ones)
# OUTPUT:
# theta - vector of MLE regression coefficients, starting with the intercept. These should match those from a call to glm() with family = binomial.
# test.statistic - value of the goodness-of-fit test statistic
# null.dist - ordered bootstrap null distribution of the test statistic based on 2000 bootstrap samples.
# p.value.test - bootstrap p-value for the null hypothesis that the logistic regression model holds, so small p-values are evidence of lack of fit.
#########################
goodness.fit.test<-min.dist.test(x,y)
goodness.fit.test$theta
# [1] -2.03693352 0.01093048 0.41060119 -0.20651005
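######################
# Optional cross-check (an addition, not part of the original example):
# a minimal sketch that refits the same logistic model with glm(), so that its
# coefficients can be compared with goodness.fit.test$theta above.
# The name kyphosis.glm is introduced here for illustration only.
######################
kyphosis.glm <- glm(Kyphosis ~ Age + Number + Start, family = binomial, data = kyphosis)
unname(coef(kyphosis.glm))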
goodness.fit.test$test.statistic
# [1] 0.04116431
goodness.fit.test$p.value.test
# [1] 0.007
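######################
# Optional sketch (an addition, not part of the original example): how a bootstrap
# p-value can be recovered from null.dist. This ASSUMES the usual definition, the
# proportion of bootstrap statistics at least as large as the observed one.
# DistFit.R may use a slightly different convention, so small differences from
# p.value.test are possible. boot.p.value is a hypothetical helper, not a DistFit.R function.
######################
boot.p.value <- function(null.dist, test.statistic) mean(null.dist >= test.statistic)
# The guard only lets this sketch run on its own when the test above has not been run.
if (exists("goodness.fit.test")) {
  boot.p.value(goodness.fit.test$null.dist, goodness.fit.test$test.statistic)
}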
######################
# Note that the p-value will differ slightly from run to run because it is based on random bootstrap samples, but it should be similar since 2000 samples are used.
# The parameter estimates and test statistic do not depend on the bootstrap, so they should match the results above.
######################
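######################
# Rough size of the run-to-run variation (an addition, not part of the original example).
# Treating the 2000 bootstrap statistics as independent draws, the p-value is a binomial
# proportion, so its Monte Carlo standard error is about sqrt(p * (1 - p) / B).
# p.hat and B are illustrative names; p.hat is the p-value printed above.
######################
p.hat <- 0.007
B <- 2000
mc.se <- sqrt(p.hat * (1 - p.hat) / B)
mc.se
# approximately 0.0019, so repeated runs should typically give p-values within a few thousandths of 0.007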