Chapter 8 Feature Selection Example

Feature Selection in R and Caret

library(caret)
library(doParallel) # parallel processing
## Loading required package: foreach
## Loading required package: iterators
## Loading required package: parallel
library(dplyr) # Used by caret
library(pROC) # plot the ROC curve
## Type 'citation("pROC")' for a citation.
## 
## Attaching package: 'pROC'
## The following objects are masked from 'package:stats':
## 
##     cov, smooth, var
library(foreign)

### Use the segmentationData from caret
# Load the data and construct indices to divide it into training and test data sets.
#set.seed(10)
kc1 <- read.arff("./datasets/defectPred/D1/KC1.arff")
inTrain <- createDataPartition(y = kc1$Defective,
                               ## the outcome data are needed
                               p = .75,
                               ## the percentage of data in
                               ## the training set
                               list = FALSE)

The function createDataPartition creates stratified partitions: the class proportions of the outcome are preserved in both the training and the test set.

training <- kc1[inTrain,]
nrow(training)
## [1] 1573
testing <- kc1[-inTrain, ]
nrow(testing)
## [1] 523
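To see what stratification means in practice, a base-R sketch (using a made-up imbalanced factor, not the KC1 data) samples 75% *within each class level* and checks that both subsets keep the original class proportions:

```r
## Toy outcome with an 80/20 class imbalance (illustrative only)
set.seed(1)
y <- factor(rep(c("N", "Y"), times = c(80, 20)))

## Stratified split: sample 75% within each class level separately
idx <- unlist(lapply(split(seq_along(y), y),
                     function(i) sample(i, size = floor(0.75 * length(i)))))

prop.table(table(y[idx]))   ## training subset keeps the 80/20 split
prop.table(table(y[-idx]))  ## so does the held-out 25%
```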

The train function can be used to

+ evaluate, using resampling, the effect of model tuning parameters on performance
+ choose the “optimal” model across these parameters
+ estimate model performance from a training set

fitControl <- trainControl(## 10-fold CV
                           method = "repeatedcv",
                           number = 10,
                           ## repeated ten times
                           repeats = 10)
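Under the hood, method = "repeatedcv" repeatedly shuffles the rows into k folds. A base-R sketch of the fold assignment (no model fitting, toy sample size):

```r
set.seed(2)
n <- 100           ## illustrative sample size
k <- 10; reps <- 10

## Each repeat is an independent random assignment of rows to k folds
folds <- replicate(reps,
                   sample(rep(seq_len(k), length.out = n)),
                   simplify = FALSE)

table(folds[[1]])  ## within a repeat, each fold holds n/k rows
length(folds)      ## 10 repeats of 10-fold CV -> 100 resamples in total
```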

gbmFit1 <- train(Defective ~ .,
                 data = training,
                 method = "gbm",
                 trControl = fitControl,
                 ## This last option is actually one
                 ## for gbm() that passes through
                 verbose = FALSE)
gbmFit1
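By default train tunes gbm over a small built-in grid. An explicit grid can be built with base R's expand.grid (the column names below are caret's tuning parameters for gbm); this sketch only constructs the grid and does not run train:

```r
## Hand-built tuning grid for gbm: one row per candidate model
gbmGrid <- expand.grid(interaction.depth = c(1, 5, 9),
                       n.trees = (1:5) * 50,
                       shrinkage = 0.1,
                       n.minobsinnode = 20)
nrow(gbmGrid)  ## 3 x 5 x 1 x 1 = 15 candidate models
## It would be supplied via train(..., tuneGrid = gbmGrid) -- not run here
```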

plsFit <- train(Defective ~ .,
                data = training,
                method = "pls",
                ## Center and scale the predictors for the training
                ## set and all future samples.
                preProc = c("center", "scale"))
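The preProc = c("center", "scale") option computes means and standard deviations on the training set only and reuses them for any future samples. A base-R sketch of that behaviour on toy data:

```r
set.seed(3)
train_x <- matrix(rnorm(50, mean = 10, sd = 2), ncol = 2)  ## toy training predictors
test_x  <- matrix(rnorm(10, mean = 10, sd = 2), ncol = 2)  ## toy "future" samples

## Statistics come from the training data only
mu  <- colMeans(train_x)
sdv <- apply(train_x, 2, sd)

train_s <- scale(train_x, center = mu, scale = sdv)
test_s  <- scale(test_x,  center = mu, scale = sdv)  ## same mu/sdv, NOT test-set stats

colMeans(train_s)  ## exactly ~0 by construction; colMeans(test_s) is only near 0
```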
testPred <- predict(plsFit, testing)
postResample(testPred, testing$Defective)
sensitivity(testPred, testing$Defective)
confusionMatrix(testPred, testing$Defective)
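The statistics reported by confusionMatrix can be derived from the 2x2 table directly. A base-R sketch with made-up prediction and observation factors standing in for testPred and testing$Defective:

```r
## Toy predictions and observations (positive class listed first)
pred <- factor(c("Y", "Y", "N", "N", "N", "Y", "N", "N"), levels = c("Y", "N"))
obs  <- factor(c("Y", "N", "N", "N", "Y", "Y", "N", "Y"), levels = c("Y", "N"))

cm <- table(pred, obs)        ## rows = predicted, columns = observed
accuracy    <- sum(diag(cm)) / sum(cm)
sensitivity <- cm["Y", "Y"] / sum(cm[, "Y"])  ## true positives / all actual "Y"
specificity <- cm["N", "N"] / sum(cm[, "N"])  ## true negatives / all actual "N"
c(accuracy = accuracy, sensitivity = sensitivity, specificity = specificity)
```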

When there are three or more classes, confusionMatrix will show the confusion matrix and a set of “one-versus-all” results.
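Each “one-versus-all” result treats one level as the positive class and lumps the remaining levels together. A base-R sketch on a toy 3-class outcome, computing the per-class sensitivities:

```r
obs  <- factor(c("a", "b", "c", "a", "b", "c", "a", "a", "c"))
pred <- factor(c("a", "b", "b", "a", "c", "c", "b", "a", "c"),
               levels = levels(obs))

## Per-class sensitivity: recode as "this class vs everything else"
one_vs_all <- sapply(levels(obs), function(cl) {
  p <- factor(pred == cl, levels = c(TRUE, FALSE))
  o <- factor(obs  == cl, levels = c(TRUE, FALSE))
  tab <- table(p, o)
  tab["TRUE", "TRUE"] / sum(tab[, "TRUE"])  ## sensitivity for class cl
})
one_vs_all
```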