% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/TADrfe.R
\name{TADrfe}
\alias{TADrfe}
\title{Recursive feature elimination (RFE) on binned domain data as a feature
reduction technique for random forests}
\usage{
TADrfe(
  trainData,
  tuneParams = list(ntree = 500, nodesize = 1),
  cvFolds = 5,
  cvMetric = "Accuracy",
  verbose = FALSE
)
}
\arguments{
\item{trainData}{Data frame, the binned data matrix used to build a random
forest classifier (can be obtained using \code{\link{createTADdata}}). Required.}

\item{tuneParams}{List, providing the \code{ntree} and \code{nodesize}
parameters to pass to \code{\link{randomForest}}. Default is
\code{list(ntree = 500, nodesize = 1)}.}

\item{cvFolds}{Numeric, number of folds (k) to use for k-fold
cross-validation. Default is 5.}

\item{cvMetric}{Character, performance metric used to choose the optimal
tuning parameters (one of "Kappa", "Accuracy", "MCC", "ROC", "Sens",
"Spec", "Pos Pred Value", "Neg Pred Value"). Default is "Accuracy".}

\item{verbose}{Logical, controls whether details about the modeling
should be printed. Default is FALSE.}
}
\value{
A list containing: 1) the performances extracted at each of the k
folds, and 2) the variable importances among the top features at each step of
RFE. For 1): \code{Variables} - the best subset of features to consider at
each iteration, \code{MCC} (Matthews Correlation Coefficient), \code{ROC}
(Area under the receiver operating characteristic curve), \code{Sens}
(Sensitivity), \code{Spec} (Specificity), \code{Pos Pred Value} (Positive
predictive value), \code{Neg Pred Value} (Negative predictive value),
\code{Accuracy}, and the corresponding standard deviations across the
cross-folds. For 2): \code{Overall} - the variable importance, \code{var} -
the feature name, \code{Variables} - the number of features that were
considered at each cross-fold, and \code{Resample} - the corresponding
cross-fold.
}
\description{
A wrapper function passed to \code{caret::rfe} to apply recursive feature
elimination (RFE) on binned domain data as a feature reduction technique for
random forests. Backward elimination is performed from p down to 2, by
powers of 2, where p is the number of features in the data.
}
\examples{
# Read in ARROWHEAD-called TADs at 5kb
data(arrowhead_gm12878_5kb)

# Extract unique boundaries
bounds.GR <- extractBoundaries(domains.mat = arrowhead_gm12878_5kb,
                               filter = FALSE,
                               CHR = "CHR22",
                               resolution = 5000)

# Read in GRangesList of 26 TFBS
data(tfbsList)

# Create the binned data matrix for CHR22 using:
# 5 kb binning,
# oc-type predictors from 26 different TFBS from the GM12878 cell line, and
# random under-sampling
tadData <- createTADdata(bounds.GR = bounds.GR,
                         resolution = 5000,
                         genomicElements.GR = tfbsList,
                         featureType = "oc",
                         resampling = "rus",
                         trainCHR = "CHR22",
                         predictCHR = NULL)

# Perform RFE for fully grown random forests with 100 trees using 5-fold CV
# Evaluate performances using accuracy
rfe_res <- TADrfe(trainData = tadData[[1]],
                  tuneParams = list(ntree = 100, nodesize = 1),
                  cvFolds = 5,
                  cvMetric = "Accuracy",
                  verbose = TRUE)
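
# Illustrative sketch only (not taken from the TADrfe source): the subset
# sizes visited when eliminating from p down to 2, by powers of 2.
# Assumptions: p_features = 26 is a hypothetical feature count (one per TFBS
# above), p_features and rfe_sizes are hypothetical names, and the full set
# of p features is evaluated first when p is not itself a power of 2.
p_features <- 26
rfe_sizes <- sort(unique(c(p_features, 2^(seq_len(floor(log2(p_features)))))),
                  decreasing = TRUE)
rfe_sizes
# 26 16 8 4 2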
}
