
Permutation Feature Importance for Classification with Parallel Factor Analysis
Calculates permutation feature importance for results from a 'wrapcpfa' object
generated by function cpfa. Optionally calculates a conditional
permutation feature importance instead.
Arguments
- object
An object of class 'wrapcpfa' from function cpfa.
- nshuffles
Single positive integer specifying the number of times each feature (or its conditional residuals) is permuted per replication. Permutation feature importance is averaged over these shuffles. Defaults to 10.
- type
Character specifying the type of permutation feature importance (PFI) to calculate. If type = "marginal", calculates regular PFI by shuffling each feature's observations. If type = "conditional", first fits a conditioning model that predicts a given feature from all other features, then randomly shuffles that model's residuals and reconstructs the feature as the sum of its predicted values and the shuffled residuals. PFI is then calculated with the reconstructed feature, which makes it a conditional calculation. Defaults to "marginal".
- conditional.model
Character indicating the conditioning model to use when type = "conditional". For conditional.model = "ridge", uses ridge regression as the conditioning model. For conditional.model = "rf", uses random forest as the conditioning model. Defaults to "ridge".
- ridge.lambda
Single real number greater than zero indicating the ridge regression penalty when both type = "conditional" and conditional.model = "ridge". Defaults to ridge.lambda = 1e-4.
- ntree
Single integer greater than zero indicating the number of trees in the random forest when both type = "conditional" and conditional.model = "rf". Defaults to ntree = 500.
- nodesize
Single integer greater than zero indicating the random forest node size parameter when both type = "conditional" and conditional.model = "rf". Defaults to nodesize = 5.
- safealign
Logical indicating whether to remove replications for component models from object based on the values of tccb (see help file for function cpfa for more information). Defaults to FALSE.
- safealign.stat
Character indicating the statistic used to remove replications for component models from object based on the values of tccb. For example, when safealign.stat = "min", the minimum value of tccb is identified for each replication of each component model. Defaults to "min".
- safealign.threshold
Single real number between 0 and 1, exclusive, indicating the threshold used to remove replications for component models from object based on the values of tccb. For example, if safealign.stat = "median" and safealign.threshold = 0.9, then, for each component model, any replication with a median tccb value below 0.9 is removed from feature importance calculations. Defaults to 0.9.
- parallel
Logical indicating if parallel computing should be implemented. Defaults to FALSE, which implements sequential computing.
- cl
Cluster for parallel computing, which is used when parallel = TRUE. Note that if parallel = TRUE and cl = NULL, then the cluster is defined as makeCluster(max(1L, detectCores() - 1L)).
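The default cluster described for cl can also be built by hand, for instance to limit the number of workers. The following is a minimal sketch using the base parallel package. The cap of two workers is an assumption made only to keep the example light, and the commented pficpfa call presumes that a 'wrapcpfa' object named output already exists.

```r
library(parallel)

# Same rule as the documented default: all detected cores but one, at least one.
# detectCores() can return NA on some platforms, so fall back to one core.
ncores <- detectCores()
if (is.na(ncores)) ncores <- 1L
nworkers <- max(1L, ncores - 1L)
nworkers <- min(nworkers, 2L)  # illustrative cap, not part of the default

cl <- makeCluster(nworkers)

# The cluster would then be supplied through the documented arguments, e.g.
# pfistats <- pficpfa(output, parallel = TRUE, cl = cl)

# Quick check that the workers respond, then release them.
worker_check <- unlist(parLapply(cl, seq_len(nworkers), function(i) i^2))
stopCluster(cl)
```

Building the cluster explicitly makes it straightforward to reuse the same workers across several calls and to release them once all calls have finished.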
Details
Function pficpfa measures each feature's contribution to
classification performance via permutation feature importance (PFI). The
function requires a 'wrapcpfa' object from the function cpfa that was
generated with argument align = TRUE. For each replication and component model,
the features are taken from the predicted weight matrix, which holds the
components first and then the features from argument z in cpfa. Each
feature is permuted nshuffles times; the importance is the resulting change
in each performance measure, averaged over permutations. When
type = "marginal", the feature is permuted directly, corresponding to
the feature importance of Breiman (2001). When type = "conditional", a
conditioning model (ridge regression or random forest) predicts the feature
from the others, and the model's residuals are permuted and added back to the
fitted values, preserving the feature's dependence on other features (for more
information, see Huang, 2025, and O'Gorman, 2005). When
safealign = TRUE, replications are screened per component model
using the Tucker congruence coefficients in object$tccb: replications
whose safealign.stat value does not exceed safealign.threshold
are excluded. When parallel = TRUE, parallel computing is used;
otherwise, sequential computing is used.
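The difference between the marginal and conditional calculations can be illustrated outside of the package. The sketch below is not the internal code of pficpfa; it uses base R only, with simulated toy data, a logistic regression from glm as a stand-in classifier, accuracy as the performance measure, and a closed-form ridge fit as the conditioning model. The helper names (accuracy, ridge_fitted, pfi) are hypothetical, and defining importance as baseline accuracy minus permuted accuracy is an assumption about the sign convention.

```r
set.seed(1)

# Toy data (illustrative only): x2 is correlated with x1; the class depends on x1.
n  <- 500
x1 <- rnorm(n)
x2 <- 0.8 * x1 + rnorm(n, sd = 0.6)
x3 <- rnorm(n)
y  <- rbinom(n, 1, plogis(2 * x1))
X  <- data.frame(x1 = x1, x2 = x2, x3 = x3)

# Stand-in classifier and performance measure (accuracy).
fit <- glm(y ~ x1 + x2 + x3, data = cbind(X, y = y), family = binomial)
accuracy <- function(newX) {
  mean((predict(fit, newdata = newX, type = "response") > 0.5) == y)
}
baseline <- accuracy(X)

# Closed-form ridge fit of feature j on the remaining (centered) features.
ridge_fitted <- function(X, j, lambda = 1e-4) {
  Z <- scale(as.matrix(X[, -j, drop = FALSE]), center = TRUE, scale = FALSE)
  target <- X[[j]]
  beta <- solve(crossprod(Z) + lambda * diag(ncol(Z)),
                crossprod(Z, target - mean(target)))
  mean(target) + drop(Z %*% beta)
}

# Importance of feature j: mean drop in accuracy over nshuffles permutations.
pfi <- function(X, j, type = c("marginal", "conditional"), nshuffles = 10) {
  type <- match.arg(type)
  drops <- numeric(nshuffles)
  for (s in seq_len(nshuffles)) {
    Xp <- X
    if (type == "marginal") {
      # Shuffle the feature's observations directly.
      Xp[[j]] <- sample(X[[j]])
    } else {
      # Shuffle only the residuals and add them back to the fitted values.
      fhat <- ridge_fitted(X, j)
      Xp[[j]] <- fhat + sample(X[[j]] - fhat)
    }
    drops[s] <- baseline - accuracy(Xp)
  }
  mean(drops)
}

marginal    <- sapply(seq_along(X), pfi, X = X, type = "marginal")
conditional <- sapply(seq_along(X), pfi, X = X, type = "conditional")
```

Because the conditional version keeps the part of a feature that the other features explain, it measures what the feature contributes beyond its correlated neighbors, whereas the marginal version breaks those correlations when it shuffles.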
Value
Returns a data frame containing the following nine variables: (1) nfac, a
numeric indicating the number of components in the model; (2) feature, an
integer indexing a given feature; (3) method, a character indicating the
classification method; (4) metric, a character indicating the classification
performance measure used for permutation feature importance (see help file
for function cpm for more information); (5) mean, a numeric value that
is the mean permutation feature importance among all valid replications; (6)
median, a numeric value that is the median permutation feature importance
among all valid replications; (7) sd, a numeric value that is the standard
deviation among all permutation feature importance values for valid
replications; (8) n, an integer indicating the number of replications
available for calculating feature importance after removing replications via
safealign; and (9) nvalid, an integer indicating the actual number of
replications used for the calculation after removing any replications
resulting in NaN.
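As a rough illustration of working with the returned data frame, the sketch below builds a mock object with the nine documented columns and ranks features within each classification method. All values are made up, the metric label "acc" is a hypothetical placeholder, and treating a larger mean as more important is an assumption about the sign convention of the importance values.

```r
# Mock result with the nine documented columns; every value here is invented.
pfistats <- data.frame(
  nfac    = 3,
  feature = rep(1:3, times = 2),
  method  = rep(c("PLR", "RF"), each = 3),
  metric  = "acc",
  mean    = c(0.12, 0.03, 0.20, 0.08, 0.15, 0.01),
  median  = c(0.11, 0.03, 0.21, 0.08, 0.14, 0.01),
  sd      = c(0.02, 0.01, 0.03, 0.02, 0.04, 0.01),
  n       = 5L,
  nvalid  = c(5L, 5L, 5L, 5L, 4L, 5L)
)

# Keep one performance measure and rank features within each method.
acc <- pfistats[pfistats$metric == "acc", ]
ranked <- acc[order(acc$method, -acc$mean),
              c("method", "feature", "mean", "sd", "nvalid")]

# Rows where some replications were dropped because they resulted in NaN.
dropped <- pfistats[pfistats$nvalid < pfistats$n, ]
```

Comparing n with nvalid in this way shows how many replications actually support each summary, which is worth checking before interpreting a mean or standard deviation.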
References
Breiman, L. (2001). Random forests. Machine Learning, 45(1), 5-32.
Huang, P. (2025). Residual permutation tests for feature importance in machine learning. British Journal of Mathematical and Statistical Psychology.
O'Gorman, T. (2005). The performance of randomization tests that use permutations of independent variables. Communications in Statistics - Simulation and Computation, 34(4), 895-908.
Examples
########## Parafac2 example with 4-way array and multiclass response ##########
if (FALSE) { # \dontrun{
# set seed
set.seed(5)
# define list of arguments specifying distributions for A and G weights
techlist <- list(distA = list(dname = "poisson",
                              lambda = 3),  # for A weights
                 distG = list(dname = "gamma", shape = 2,
                              scale = 4))   # for G weights
# define target correlation matrix for columns of D mode weights matrix
cormat <- matrix(c(1, .6, .6, .6, 1, .6, .6, .6, 1), nrow = 3, ncol = 3)
# simulate a four-way ragged array connected to a response
data <- simcpfa(arraydim = c(10, 11, 12, 100), model = "parafac2", nfac = 3,
                nclass = 3, nreps = 1e2, onreps = 10, corresp = rep(.6, 3),
                meanpred = rep(2, 3), modes = 4, corrpred = cormat,
                technical = techlist, smethod = "eigende")
# initialize
alpha <- seq(0, 1, length = 20)
gamma <- c(0, 1)
cost <- c(0.1, 5)
ntree <- c(200, 300)
nodesize <- c(1, 2)
size <- c(1, 2)
decay <- c(0, 1)
rda.alpha <- seq(0.1, 0.9, length = 2)
delta <- c(0.1, 2)
eta <- c(0.3, 0.7)
max.depth <- c(1, 2)
subsample <- c(0.75)
nrounds <- c(100)
method <- c("PLR", "SVM", "RF", "NN", "RDA", "GBM")
family <- "multinomial"
parameters <- list(alpha = alpha, gamma = gamma, cost = cost, ntree = ntree,
                   nodesize = nodesize, size = size, decay = decay,
                   rda.alpha = rda.alpha, delta = delta, eta = eta,
                   max.depth = max.depth, subsample = subsample,
                   nrounds = nrounds)
model <- "parafac2"
nfolds <- 10
nstart <- 10
# constrain first mode weights to be orthogonal, fourth mode to be nonnegative
const <- c("orthog", "uncons", "uncons", "nonneg")
# fit Parafac2 model and use fourth mode weights to tune classification
# methods, to predict class labels, and to return classification
# performance measures pooled across multiple train-test splits
output <- cpfa(x = data$X, y = data$y, model = model, nfac = 3,
               nrep = 5, ratio = 0.9, nfolds = nfolds, method = method,
               family = family, parameters = parameters, align = TRUE,
               type.out = "descriptives", seeds = NULL, plot.out = TRUE,
               parallel = FALSE, const = const, nstart = nstart)
# calculate permutation feature importance for output
pfistats <- pficpfa(output, nshuffles = 5, type = "marginal", safealign = FALSE)
} # }