RegressionPartitionedEnsemble

Package: classreg.learning.partition
Superclasses: RegressionPartitionedModel

Cross-validated regression ensemble

Description

RegressionPartitionedEnsemble is a set of regression ensembles trained on cross-validated folds. Estimate the quality of regression by cross-validation using one or more "kfold" methods: kfoldfun, kfoldLoss, or kfoldPredict. Every "kfold" method uses models trained on in-fold observations to predict the response for out-of-fold observations. For example, suppose you cross-validate using five folds. In this case, every training fold contains roughly 4/5 of the data and every test fold contains roughly 1/5 of the data. The first model, stored in Trained{1}, was trained on X and Y with the first 1/5 excluded; the second model, stored in Trained{2}, was trained on X and Y with the second 1/5 excluded; and so on. When you call kfoldPredict, it computes predictions for the first 1/5 of the data using the first model, for the second 1/5 of the data using the second model, and so on. In short, kfoldPredict computes the response for every observation using the model trained without that observation.
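The out-of-fold behavior described above can be checked by hand. The following is a minimal sketch, not a definitive recipe: it assumes the carsmall sample data and the predictor subset used in the Examples section, and the seed and the choice of five folds are arbitrary. It reads the data back from the cross-validated object (cvens.X) so that the row indices line up with cvens.Partition.

```matlab
% Sketch: compare kfoldPredict with a manual prediction for one fold.
load carsmall                                   % sample data set
X = [Cylinders Displacement Horsepower Weight]; % assumed predictor subset
Y = MPG;

rng(1,'twister')                                % arbitrary seed, for reproducibility
cvens = fitrensemble(X,Y,'KFold',5);            % five-fold cross-validated ensemble

yhat = kfoldPredict(cvens);                     % one out-of-fold prediction per observation

% Trained{1} was fit without the observations in test fold 1, so predicting
% those observations with it should match the corresponding rows of yhat.
idx1  = test(cvens.Partition,1);                % logical index of test fold 1
yhat1 = predict(cvens.Trained{1},cvens.X(idx1,:));

maxDiff = max(abs(yhat(idx1) - yhat1))          % expected to be (near) zero
```

Repeating the last three lines for folds 2 through 5 covers the remaining observations.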

Construction

cvens = crossval(ens) creates a cross-validated ensemble from ens, a regression ensemble. For syntax details, see the crossval method reference page.

cvens = fitrensemble(X,Y,Name,Value) creates a cross-validated ensemble when Name is one of 'crossval', 'kfold', 'holdout', 'leaveout', or 'cvpartition'. For syntax details, see the fitrensemble function reference page.
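As an illustration of the two construction routes, here is a short sketch. It assumes the carsmall sample data used in the Examples section; the variable names cvens1 and cvens2 and the choice of five folds are illustrative only.

```matlab
% Sketch: two ways to obtain a cross-validated regression ensemble.
load carsmall
X = [Cylinders Displacement Horsepower Weight];
Y = MPG;

% Route 1: train a full ensemble first, then cross-validate it.
ens    = fitrensemble(X,Y);
cvens1 = crossval(ens);               % uses the default number of folds

% Route 2: request cross-validation directly when training.
cvens2 = fitrensemble(X,Y,'KFold',5);

class(cvens1)                         % class name of the returned object
numel(cvens1.Trained)                 % number of per-fold ensembles, route 1
numel(cvens2.Trained)                 % number of per-fold ensembles, route 2
```

Route 1 keeps the full ensemble ens available for prediction on new data; route 2 is more direct when only the cross-validation estimate is needed.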

Input Arguments

ens

A regression ensemble constructed with fitrensemble.

Properties

BinEdges

Bin edges for numeric predictors, specified as a cell array of p numeric vectors, where p is the number of predictors. Each vector includes the bin edges for a numeric predictor. The element in the cell array for a categorical predictor is empty because the software does not bin categorical predictors.

The software bins numeric predictors only if you specify the 'NumBins' name-value argument as a positive integer scalar when training a model with tree learners. The BinEdges property is empty if the 'NumBins' value is empty (default).

You can reproduce the binned predictor data Xbinned by using the BinEdges property of the trained model mdl.

X = mdl.X; % Predictor data
Xbinned = zeros(size(X));
edges = mdl.BinEdges;
% Find indices of binned predictors.
idxNumeric = find(~cellfun(@isempty,edges));
if iscolumn(idxNumeric)
    idxNumeric = idxNumeric';
end
for j = idxNumeric
    x = X(:,j);
    % Convert x to array if x is a table.
    if istable(x)
        x = table2array(x);
    end
    % Group x into bins by using the discretize function.
    xbinned = discretize(x,[-inf; edges{j}; inf]);
    Xbinned(:,j) = xbinned;
end

Xbinned contains the bin indices, ranging from 1 to the number of bins, for numeric predictors. Xbinned values are 0 for categorical predictors. If X contains NaNs, then the corresponding Xbinned values are NaNs.

CategoricalPredictors

Categorical predictor indices, specified as a vector of positive integers. CategoricalPredictors contains index values indicating that the corresponding predictors are categorical. The index values are between 1 and p, where p is the number of predictors used to train the model. If none of the predictors are categorical, then this property is empty ([]).

CrossValidatedModel

Name of the cross-validated model, a character vector.

Kfold

Number of folds used in the cross-validated ensemble, a positive integer.

ModelParameters

Object holding the parameters of the cross-validated ensemble.

NumObservations

Numeric scalar containing the number of observations in the training data.

NumTrainedPerFold

Vector of Kfold elements. Each entry contains the number of trained learners in the corresponding cross-validation fold.

Partition

The partition, of class cvpartition, used in creating the cross-validated ensemble.
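To see how the partition assigns observations to folds, the following sketch lists the fold sizes and confirms that the test folds together cover every observation. It assumes the carsmall sample data from the Examples section; the seed and the five-fold setting are arbitrary.

```matlab
% Sketch: inspect the cvpartition stored in a cross-validated ensemble.
load carsmall
X = [Cylinders Displacement Horsepower Weight];
Y = MPG;

rng(1,'twister')                       % arbitrary seed, for reproducibility
cvens = fitrensemble(X,Y,'KFold',5);

c = cvens.Partition;                   % cvpartition object
for k = 1:c.NumTestSets
    fprintf('Fold %d: %d training, %d test observations\n', ...
        k, c.TrainSize(k), c.TestSize(k));
end

% Accumulate the test-fold membership of every observation.
inTest = false(c.NumObservations,1);
for k = 1:c.NumTestSets
    inTest = inTest | test(c,k);       % logical index of test fold k
end
allCovered = all(inTest)               % true if each observation is tested once or more
```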

PredictorNames

A cell array of names for the predictor variables, in the order in which they appear in X.

ResponseName

Name of the response variable Y, a character vector.

ResponseTransform

Function handle for transforming scores, or character vector representing a built-in transformation function. 'none' means no transformation; equivalently, 'none' means @(x)x.

Add or change a ResponseTransform function using dot notation:

ens.ResponseTransform = @function

Trainable

Cell array of ensembles trained on cross-validation folds. Every ensemble is full, meaning it contains its training data and weights.

Trained

Cell array of compact ensembles trained on cross-validation folds.

W

The scaled weights, a vector with length n, the number of rows in X.

X

A matrix or table of predictor values. Each column of X represents one variable, and each row represents one observation.

Y

A numeric column vector with the same number of rows as X. Each entry in Y is the response to the data in the corresponding row of X.

Object Functions

kfoldLoss Loss for cross-validated partitioned regression model
kfoldPredict Predict responses for observations in cross-validated regression model
kfoldfun Cross-validate function for regression
resume Resume training ensemble

Copy Semantics

Value. To learn how value classes affect copy operations, see Copying Objects.

Examples

Construct a partitioned regression ensemble, and examine the cross-validation losses for the folds.

Load the carsmall data set.

load carsmall;

Create a subset of variables.

XX = [Cylinders Displacement Horsepower Weight];
YY = MPG;

Construct the ensemble model.

rens = fitrensemble(XX,YY);

Create a cross-validated ensemble from rens.

rng(10,'twister') % For reproducibility
cvrens = crossval(rens);

Examine the cross-validation losses.

L = kfoldLoss(cvrens,'mode','individual')
L = 10×1

   21.4489
   48.4388
   28.2560
   17.5354
   29.9441
   49.5254
   51.2372
   31.0152
   31.6388
    8.9607

L is a vector containing the cross-validation loss for each of the 10 folds.
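For comparison with the per-fold losses, the sketch below also requests the other aggregation modes of kfoldLoss. It repeats the setup of this example so that it runs on its own; the variable names Lind, Lavg, and Lcum are illustrative only.

```matlab
% Sketch: per-fold, averaged, and cumulative cross-validation loss.
load carsmall
XX = [Cylinders Displacement Horsepower Weight];
YY = MPG;

rens = fitrensemble(XX,YY);
rng(10,'twister')                               % for reproducibility
cvrens = crossval(rens);

Lind = kfoldLoss(cvrens,'Mode','individual');   % one loss value per fold
Lavg = kfoldLoss(cvrens);                       % single value averaged over folds (default mode)
Lcum = kfoldLoss(cvrens,'Mode','cumulative');   % loss as learners are added to the ensembles

size(Lind)
size(Lcum)
```

The 'cumulative' mode is the one indexed by trained learners, which can help when judging whether fewer learning cycles would give a similar cross-validation loss.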