
resubPredict

Classify training data using trained classifier

    Description

    label = resubPredict(Mdl) returns a vector of predicted class labels (label) for the trained classification model Mdl using the predictor data stored in Mdl.X.

    [label,Score] = resubPredict(Mdl) also returns classification scores.

    [label,Score] = resubPredict(Mdl,'IncludeInteractions',includeInteractions) specifies whether to include interaction terms in computations. This syntax applies only to generalized additive models.

    [label,Score,Cost] = resubPredict(Mdl) also returns the expected misclassification cost. This syntax applies only to k-nearest neighbor and naive Bayes models.

    Examples


    Load the fisheriris data set. Create X as a numeric matrix that contains four measurements for 150 irises. Create Y as a cell array of character vectors that contains the corresponding iris species.

    load fisheriris
    X = meas;
    Y = species;
    rng('default') % For reproducibility

    Train a naive Bayes classifier using the predictors X and class labels Y. A recommended practice is to specify the class names. fitcnb assumes that each predictor is conditionally and normally distributed.

    Mdl = fitcnb(X,Y,'ClassNames',{'setosa','versicolor','virginica'})
    Mdl = 
      ClassificationNaiveBayes
                  ResponseName: 'Y'
         CategoricalPredictors: []
                    ClassNames: {'setosa'  'versicolor'  'virginica'}
                ScoreTransform: 'none'
               NumObservations: 150
             DistributionNames: {'normal'  'normal'  'normal'  'normal'}
        DistributionParameters: {3x4 cell}

      Properties, Methods

    Mdl is a trained ClassificationNaiveBayes classifier.
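
    Because fitcnb fits a normal distribution to each predictor within each class, you can inspect the estimated parameters directly. A minimal sketch (per the ClassificationNaiveBayes property layout, cell {i,j} holds the fitted [mean; standard deviation] of predictor j for class i):

    Mdl.DistributionParameters{1,1} % [mean; std] of predictor 1 for class 'setosa'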

    Predict the training sample labels.

    label = resubPredict(Mdl);

    Display the results for a random set of 10 observations.

    idx = randsample(size(X,1),10);
    table(Y(idx),label(idx),'VariableNames',...
        {'True Label','Predicted Label'})
    ans=10×2 table
        True Label      Predicted Label
      ______________    _______________

      {'virginica' }    {'virginica' }
      {'setosa'    }    {'setosa'    }
      {'virginica' }    {'virginica' }
      {'versicolor'}    {'versicolor'}
      {'virginica' }    {'virginica' }
      {'versicolor'}    {'versicolor'}
      {'virginica' }    {'virginica' }
      {'setosa'    }    {'setosa'    }
      {'virginica' }    {'virginica' }
      {'setosa'    }    {'setosa'    }

    Create a confusion chart from the true labels Y and the predicted labels label.

    cm = confusionchart(Y,label);

    Figure contains an object of type ConfusionMatrixChart.
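
    To summarize the confusion chart in a single number, you can also compute the resubstitution classification loss. A minimal sketch using resubLoss (for naive Bayes models the default loss is the minimal expected misclassification cost, which matches the misclassification rate under the default cost matrix):

    L = resubLoss(Mdl) % in-sample classification loss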

    Load the ionosphere data set. This data set has 34 predictors and 351 binary responses for radar returns, either bad ('b') or good ('g').

    load ionosphere

    Train a support vector machine (SVM) classifier. Standardize the data and specify that 'g' is the positive class.

    SVMModel = fitcsvm(X,Y,'ClassNames',{'b','g'},'Standardize',true);

    SVMModel is a ClassificationSVM classifier.

    Fit the optimal score-to-posterior-probability transformation function.

    rng(1); % For reproducibility
    ScoreSVMModel = fitPosterior(SVMModel)
    ScoreSVMModel = 
      ClassificationSVM
                 ResponseName: 'Y'
        CategoricalPredictors: []
                   ClassNames: {'b'  'g'}
               ScoreTransform: '@(S)sigmoid(S,-9.482430e-01,-1.217774e-01)'
              NumObservations: 351
                        Alpha: [90x1 double]
                         Bias: -0.1342
             KernelParameters: [1x1 struct]
                           Mu: [0.8917 0 0.6413 0.0444 0.6011 0.1159 0.5501 ... ]
                        Sigma: [0.3112 0 0.4977 0.4414 0.5199 0.4608 0.4927 ... ]
               BoxConstraints: [351x1 double]
              ConvergenceInfo: [1x1 struct]
              IsSupportVector: [351x1 logical]
                       Solver: 'SMO'

      Properties, Methods

    Because the classes are inseparable, the score transformation function (ScoreSVMModel.ScoreTransform) is the sigmoid function.
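
    The stored transform maps a raw score S to a posterior probability as 1/(1 + exp(A*S + B)), where A and B are the slope and intercept fitted by fitPosterior. A minimal sketch, assuming that documented form and reusing the fitted values printed in ScoreTransform above, reproduces the first posterior probability in the output below by hand:

    A = -9.482430e-01; % fitted sigmoid slope from ScoreSVMModel.ScoreTransform
    B = -1.217774e-01; % fitted sigmoid intercept
    s = 1.4862;        % raw positive-class score of the first observation
    p = 1/(1 + exp(A*s + B)) % returns approximately 0.82216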

    Estimate scores and positive class posterior probabilities for the training data. Display the results for the first 10 observations.

    [label,scores] = resubPredict(SVMModel);
    [~,postProbs] = resubPredict(ScoreSVMModel);
    table(Y(1:10),label(1:10),scores(1:10,2),postProbs(1:10,2),'VariableNames',...
        {'TrueLabel','PredictedLabel','Score','PosteriorProbability'})
    ans=10×4 table
      TrueLabel    PredictedLabel     Score     PosteriorProbability
      _________    ______________    _______    ____________________

        {'g'}          {'g'}          1.4862          0.82216
        {'b'}          {'b'}         -1.0003          0.30433
        {'g'}          {'g'}          1.8685          0.86917
        {'b'}          {'b'}         -2.6457         0.084171
        {'g'}          {'g'}          1.2807          0.79186
        {'b'}          {'b'}         -1.4616          0.22025
        {'g'}          {'g'}          2.1674          0.89816
        {'b'}          {'b'}         -5.7085          0.00501
        {'g'}          {'g'}          2.4798          0.92224
        {'b'}          {'b'}         -2.7812         0.074781

    Estimate the logit of posterior probabilities (classification scores) for training data using a classification generalized additive model (GAM) that contains both linear and interaction terms for predictors. Specify whether to include interaction terms when computing the classification scores.

    Load the ionosphere data set. This data set has 34 predictors and 351 binary responses for radar returns, either bad ('b') or good ('g').

    load ionosphere

    Train a GAM using the predictors X and class labels Y. A recommended practice is to specify the class names. Specify to include the 10 most important interaction terms.

    Mdl = fitcgam(X,Y,'ClassNames',{'b','g'},'Interactions',10)
    Mdl = 
      ClassificationGAM
                 ResponseName: 'Y'
        CategoricalPredictors: []
                   ClassNames: {'b'  'g'}
               ScoreTransform: 'logit'
                    Intercept: 3.2565
                 Interactions: [10x2 double]
              NumObservations: 351

      Properties, Methods

    Mdl is a ClassificationGAM model object.

    Predict the labels using both linear and interaction terms, and then using only linear terms. To exclude interaction terms, specify 'IncludeInteractions',false. Estimate the logit of posterior probabilities by specifying the ScoreTransform property as 'none'.

    Mdl.ScoreTransform = 'none';
    [labels,scores] = resubPredict(Mdl);
    [labels_nointeraction,scores_nointeraction] = resubPredict(Mdl,'IncludeInteractions',false);

    Create a table containing the true labels, predicted labels, and scores. Display the first eight rows of the table.

    t = table(Y,labels,scores,labels_nointeraction,scores_nointeraction,...
        'VariableNames',{'True Labels','Predicted Labels','Scores',...
        'Predicted Labels Without Interactions','Scores Without Interactions'});
    head(t)
    ans=8×5 table
      True Labels    Predicted Labels         Scores          Predicted Labels Without Interactions    Scores Without Interactions
      ___________    ________________    ________________    _____________________________________    ___________________________

        {'g'}             {'g'}          -51.628    51.628                    {'g'}                       -47.676     47.676
        {'b'}             {'b'}           37.433   -37.433                    {'b'}                        36.435    -36.435
        {'g'}             {'g'}          -62.061    62.061                    {'g'}                       -58.357     58.357
        {'b'}             {'b'}           37.666   -37.666                    {'b'}                        36.297    -36.297
        {'g'}             {'g'}          -47.361    47.361                    {'g'}                       -43.373     43.373
        {'b'}             {'b'}           106.48   -106.48                    {'b'}                        102.43    -102.43
        {'g'}             {'g'}          -62.665    62.665                    {'g'}                       -58.377     58.377
        {'b'}             {'b'}           201.46   -201.46                    {'b'}                        197.84    -197.84

    The predicted labels for the training data X do not vary depending on the inclusion of interaction terms, but the estimated score values are different.
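
    You can confirm this programmatically. A minimal sketch, reusing the variables from this example:

    isequal(labels,labels_nointeraction)          % labels agree exactly
    max(abs(scores(:) - scores_nointeraction(:))) % scores differ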

    Estimate in-sample posterior probabilities and misclassification costs using a naive Bayes classifier.

    Load the fisheriris data set. Create X as a numeric matrix that contains four measurements for 150 irises. Create Y as a cell array of character vectors that contains the corresponding iris species.

    load fisheriris
    X = meas;
    Y = species;
    rng('default') % For reproducibility

    Train a naive Bayes classifier using the predictors X and class labels Y. A recommended practice is to specify the class names. fitcnb assumes that each predictor is conditionally and normally distributed.

    Mdl = fitcnb(X,Y,'ClassNames',{'setosa','versicolor','virginica'});

    Mdl is a trained ClassificationNaiveBayes classifier.

    Estimate the posterior probabilities and expected misclassification costs for the training data.

    [label,Posterior,MisclassCost] = resubPredict(Mdl);
    Mdl.ClassNames
    ans = 3x1 cell
        {'setosa'    }
        {'versicolor'}
        {'virginica' }

    Display the results for a random set of 10 observations.

    idx = randsample(size(X,1),10);
    table(Y(idx),label(idx),Posterior(idx,:),MisclassCost(idx,:),'VariableNames',...
        {'TrueLabel','PredictedLabel','PosteriorProbability','MisclassificationCost'})
    ans=10×4 table
        TrueLabel       PredictedLabel               PosteriorProbability                        MisclassificationCost
      ______________    ______________    _______________________________________    ______________________________________

      {'virginica' }    {'virginica' }    6.2514e-269    1.1709e-09             1             1             1    1.1709e-09
      {'setosa'    }    {'setosa'    }              1    5.5339e-19     2.485e-25    5.5339e-19             1             1
      {'virginica' }    {'virginica' }    7.4191e-249    1.4481e-10             1             1             1    1.4481e-10
      {'versicolor'}    {'versicolor'}     3.4472e-62       0.99997     3.362e-05             1     3.362e-05       0.99997
      {'virginica' }    {'virginica' }    3.4268e-229     6.597e-09             1             1             1     6.597e-09
      {'versicolor'}    {'versicolor'}     6.0941e-77        0.9998    0.00019663             1    0.00019663        0.9998
      {'virginica' }    {'virginica' }    1.3467e-167      0.002187       0.99781             1       0.99781      0.002187
      {'setosa'    }    {'setosa'    }              1    1.5776e-15    5.7172e-24    1.5776e-15             1             1
      {'virginica' }    {'virginica' }    2.0116e-232    2.6206e-10             1             1             1    2.6206e-10
      {'setosa'    }    {'setosa'    }              1    1.8085e-17    1.9639e-24    1.8085e-17             1             1

    The order of the columns of Posterior and MisclassCost corresponds to the order of the classes in Mdl.ClassNames.
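
    For k-nearest neighbor and naive Bayes models, the predicted label is the class that minimizes the expected misclassification cost, so label can be recovered from MisclassCost. A minimal sketch:

    [~,minIdx] = min(MisclassCost,[],2);  % column with the smallest expected cost
    isequal(label,Mdl.ClassNames(minIdx)) % reproduces the predicted labels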

    Input Arguments


    Mdl — Classification machine learning model, specified as a full classification model object, as given in the following table of supported models.

    Model Classification Model Object
    Generalized additive model ClassificationGAM
    k-nearest neighbor model ClassificationKNN
    Naive Bayes model ClassificationNaiveBayes
    Neural network model ClassificationNeuralNetwork
    Support vector machine (SVM) for one-class and binary classification ClassificationSVM

    includeInteractions — Flag to include interaction terms of the model, specified as true or false. This argument is valid only for a generalized additive model (GAM). That is, you can specify this argument only when Mdl is ClassificationGAM.

    The default value is true if Mdl contains interaction terms. The value must be false if the model does not contain interaction terms.

    Data Types: logical
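
    Because the value must be false for a model without interaction terms, it can help to check the trained GAM first. A minimal sketch, assuming that the Interactions property (shown in the GAM example above) is empty when the model has no interaction terms:

    hasInteractions = ~isempty(Mdl.Interactions); % assumption: empty when no interactions
    [label,Score] = resubPredict(Mdl,'IncludeInteractions',hasInteractions);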

    Output Arguments


    label — Predicted class labels, returned as a categorical or character array, logical or numeric vector, or cell array of character vectors.

    label has the same data type as the observed class labels that trained Mdl, and its length is equal to the number of observations in Mdl.X. (The software treats string arrays as cell arrays of character vectors.)

    Score — Class scores, returned as a numeric matrix. Score has rows equal to the number of observations in Mdl.X and columns equal to the number of distinct classes in the training data (size(Mdl.ClassNames,1)).
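
    Column k of Score corresponds to class Mdl.ClassNames(k). As a minimal sketch, with the default cost matrix the predicted label matches the column with the highest score:

    [~,maxIdx] = max(Score,[],2);            % column index of the highest score per row
    labelFromScore = Mdl.ClassNames(maxIdx); % matches label under the default cost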

    Cost — Expected misclassification costs, returned as a numeric matrix. This output applies only to k-nearest neighbor and naive Bayes models. That is, resubPredict returns Cost only when Mdl is ClassificationKNN or ClassificationNaiveBayes.

    Cost has rows equal to the number of observations in Mdl.X and columns equal to the number of distinct classes in the training data (size(Mdl.ClassNames,1)).

    Cost(j,k) is the expected cost of classifying the observation in row j of Mdl.X into class k (that is, class Mdl.ClassNames(k)).
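
    The expected cost is the posterior-weighted cost matrix: Cost(j,k) is the sum over classes i of Posterior(j,i)*Mdl.Cost(i,k), assuming the convention that Mdl.Cost(i,k) is the cost of classifying an observation into class k when its true class is i. A minimal sketch verifying this with the naive Bayes outputs from the last example:

    CostCheck = Posterior*Mdl.Cost;          % posterior-weighted misclassification costs
    max(abs(CostCheck(:) - MisclassCost(:))) % should be near 0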

    Algorithms

    resubPredict computes predictions according to the corresponding predict object function of the model (Mdl). For a model-specific description, see the predict function reference pages in the following table; a sketch of the equivalence follows the table.

    Model Classification Model Object (Mdl) predict Object Function
    Generalized additive model ClassificationGAM predict
    k-nearest neighbor model ClassificationKNN predict
    Naive Bayes model ClassificationNaiveBayes predict
    Neural network model ClassificationNeuralNetwork predict
    Support vector machine (SVM) for one-class and binary classification ClassificationSVM predict
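
    In other words, calling resubPredict on a trained model is equivalent to calling its predict object function on the stored training predictors. A minimal sketch:

    isequal(resubPredict(Mdl),predict(Mdl,Mdl.X)) % same labels either way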


    Version History

    Introduced in R2012a