A function to generate a ModellingParams object

Usage

generateModellingParams(
  assayIDs,
  measurements,
  nFeatures,
  selectionMethod,
  selectionOptimisation,
  performanceType = "auto",
  classifier,
  multiViewMethod = "none"
)

Arguments

assayIDs

A vector of data set identifiers, as long as the number of data sets.

measurements

Either a DataFrame, data.frame, matrix, MultiAssayExperiment or a list of these objects containing the data.

nFeatures

The number of features to be used for classification. If a single number, the same number of features is used for all comparisons or assays. If a numeric vector, its values are candidates to be optimised over using selectionOptimisation. If a named vector whose names match those of multiple assays, a different number of features is used for each assay. If a named list of vectors, the respective candidate numbers of features are optimised over for each assay. Set to NULL or "all" if all features should be used.
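The accepted forms of nFeatures can be easier to tell apart when written out. The following is a minimal sketch in base R; the variable names and numeric values are hypothetical, and the assay names are borrowed from the example on this page.

```r
# Hypothetical values illustrating each form of nFeatures described above.

# Single number: the same number of features for every comparison or assay.
nFeaturesSingle <- 20

# Unnamed numeric vector: candidate values to be optimised over.
nFeaturesCandidates <- c(10, 20, 50)

# Named vector: one fixed number of features per assay.
nFeaturesPerAssay <- c(clinical = 5, gene = 20, protein = 10)

# Named list of vectors: candidate values to be optimised over, per assay.
nFeaturesPerAssayCandidates <- list(
  clinical = c(5, 10),
  gene     = c(20, 50, 100),
  protein  = c(10, 20)
)

# NULL (or the string "all"): use every feature.
nFeaturesAll <- NULL
```

Any one of these objects would be passed as the nFeatures argument; they are alternatives, not values to be combined.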

selectionMethod

Default: "auto". A character vector of feature selection methods to compare. If a named character vector with names corresponding to different assays, and performing multiview classification, the respective feature selection methods will be used on each assay. If "auto", features are ranked by t-test (two categories) or F-test (three or more categories) for a classification task, and the top nFeatures are chosen by optimisation. Otherwise (a survival task), the ranking method is the per-feature Cox proportional hazards p-value.
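To make the idea of t-test ranking followed by a top-nFeatures cut-off concrete, here is a minimal sketch in base R on simulated two-class data. It is not the package's own implementation; the data, the feature names and the choice of nFeatures = 2 are all assumptions made for illustration.

```r
set.seed(1)

# Simulated data (hypothetical): 40 samples, 6 features, two classes.
classes <- factor(rep(c("A", "B"), each = 20))
x <- matrix(rnorm(40 * 6), nrow = 40,
            dimnames = list(NULL, paste0("feature", 1:6)))

# Shift feature3 in class B so that it is clearly informative.
x[classes == "B", "feature3"] <- x[classes == "B", "feature3"] + 3

# Per-feature two-sample t-test p-values.
pValues <- apply(x, 2, function(feature) t.test(feature ~ classes)$p.value)

# Rank features by p-value and keep the top nFeatures.
nFeatures <- 2
topFeatures <- names(sort(pValues))[seq_len(nFeatures)]
topFeatures
```

With three or more classes, the same pattern would apply with a per-feature F-test (for example, via oneway.test) in place of t.test.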

selectionOptimisation

A character string, one of "Resubstitution", "Nested CV" or "none", specifying the approach used to optimise nFeatures.

performanceType

Default: "auto". The performance metric to optimise if the classifier has any tuning parameters.

classifier

Default: "auto". A character vector of classification methods to compare. If a named character vector with names corresponding to different assays, and performing multiview classification, the respective classification methods will be used on each assay. If "auto", then a random forest is used for a classification task or Cox proportional hazards model for a survival task.
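For multiview classification, the per-assay form of selectionMethod and classifier is a named character vector. A minimal sketch, using only the method keywords that appear in the example on this page ("t-test" and "randomForest"); the variable names are hypothetical.

```r
# Names must correspond to the assay identifiers.
assayIDs <- c("clinical", "gene", "protein")

# One feature selection method per assay.
selectionMethodPerAssay <- c(clinical = "t-test", gene = "t-test", protein = "t-test")

# One classifier per assay.
classifierPerAssay <- c(clinical = "randomForest", gene = "randomForest",
                        protein = "randomForest")

# Sanity check: every assay has an entry in both vectors.
stopifnot(setequal(names(selectionMethodPerAssay), assayIDs),
          setequal(names(classifierPerAssay), assayIDs))
```

An unnamed vector, by contrast, is read as a set of methods to compare rather than a per-assay assignment.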

multiViewMethod

Default: "none". A character vector specifying the multiview method or data integration approach to use.

Value

A ModellingParams object.

Examples

data(asthma)
# First make a toy example assay with multiple data types. We'll randomly assign different features to be clinical, gene or protein.
set.seed(51773)
measurements <- DataFrame(measurements, check.names = FALSE)
mcols(measurements)$assay <- c(rep("clinical", 20), sample(c("gene", "protein"), ncol(measurements) - 20, replace = TRUE))
mcols(measurements)$feature <- colnames(measurements)
modellingParams <- generateModellingParams(assayIDs = c("clinical", "gene", "protein"),
                                          measurements = measurements, 
                                          nFeatures = list(clinical = 10, gene = 10, protein = 10),
                                          selectionMethod = list(clinical = "t-test", gene = "t-test", protein = "t-test"),
                                          selectionOptimisation = "none",
                                          classifier = "randomForest",
                                          multiViewMethod = "merge")
#> Error in generateModellingParams(assayIDs = c("clinical", "gene", "protein"),     measurements = measurements, nFeatures = list(clinical = 10,         gene = 10, protein = 10), selectionMethod = list(clinical = "t-test",         gene = "t-test", protein = "t-test"), selectionOptimisation = "none",     classifier = "randomForest", multiViewMethod = "merge"): could not find function "generateModellingParams"