oobPermutedPredictorImportance

Predictor importance estimates by permutation of out-of-bag predictor observations for random forest of regression trees


Syntax

`Imp = oobPermutedPredictorImportance(Mdl)`

`Imp = oobPermutedPredictorImportance(Mdl,Name,Value)`

Description

`Imp = oobPermutedPredictorImportance(Mdl)` returns a vector of out-of-bag, predictor importance estimates by permutation using the random forest of regression trees `Mdl`. `Mdl` must be a `RegressionBaggedEnsemble` model object.


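`Imp = oobPermutedPredictorImportance(Mdl,Name,Value)` uses additional options specified by one or more `Name,Value` pair arguments. For example, you can speed up computation using parallel computing, or indicate which trees to use in the predictor importance estimation.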

Input Arguments


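`Mdl` - Random forest of regression trees
`RegressionBaggedEnsemble` model object

Random forest of regression trees, specified as a `RegressionBaggedEnsemble` model object created by `fitrensemble`.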

Name-Value Pair Arguments

Specify optional comma-separated pairs of `Name,Value` arguments. `Name` is the argument name and `Value` is the corresponding value. `Name` must appear inside quotes. You can specify several name and value pair arguments in any order as `Name1,Value1,...,NameN,ValueN`.

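`'Learners'` - Indices of learners to use in predictor importance estimation
`1:Mdl.NumTrained` (default) | numeric vector of positive integers

Indices of learners to use in predictor importance estimation, specified as the comma-separated pair consisting of `'Learners'` and a numeric vector of positive integers. Values must be at most `Mdl.NumTrained`. When `oobPermutedPredictorImportance` estimates the predictor importance, it includes the learners in `Mdl.Trained(learners)` only, where `learners` is the value of `'Learners'`.

Example: `'Learners',[1:2:Mdl.NumTrained]`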
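As a minimal sketch of the `'Learners'` argument, the following compares the estimates computed from all trees with the estimates computed from the odd-numbered trees only. The `carsmall` data set, the choice of predictors, and the ensemble size of 50 are illustrative assumptions, not requirements of the argument.

```matlab
% Illustrative sketch: restrict importance estimation to a subset of trees.
load carsmall
X = table(Acceleration,Displacement,Horsepower,Weight,MPG);

rng('default') % For reproducibility
t = templateTree('Reproducible',true);
Mdl = fitrensemble(X,'MPG','Method','Bag','NumLearningCycles',50,'Learners',t);

impAll = oobPermutedPredictorImportance(Mdl);                % uses all trained trees
idx    = 1:2:Mdl.NumTrained;                                 % odd-numbered trees only
impOdd = oobPermutedPredictorImportance(Mdl,'Learners',idx); % uses Mdl.Trained(idx)
```

Both results have one element per predictor. Estimates from fewer trees are generally noisier, so the two vectors need not agree closely.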

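`'Options'` - Parallel computing options
`[]` (default) | structure array returned by `statset`

Parallel computing options, specified as the comma-separated pair consisting of `'Options'` and a structure array returned by `statset`. `'Options'` requires a Parallel Computing Toolbox™ license.

`oobPermutedPredictorImportance` uses the `'UseParallel'` field only. `statset('UseParallel',true)` invokes a pool of workers.

Example: `'Options',statset('UseParallel',true)`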

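Output Arguments

`Imp` - Out-of-bag, predictor importance estimates by permutation
numeric vector

Out-of-bag, predictor importance estimates by permutation, returned as a 1-by-p numeric vector. p is the number of predictor variables in the training data (`size(Mdl.X,2)`). `Imp(j)` is the predictor importance of the predictor `Mdl.PredictorNames(j)`.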
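Because `Imp(j)` lines up with `Mdl.PredictorNames(j)`, the two can be paired into a ranked table. The sketch below shows one way to do this. The data set, the predictors, the ensemble size, and the variable name `ranking` are illustrative assumptions.

```matlab
% Illustrative sketch: pair each estimate with its predictor name and rank them.
load carsmall
X = table(Acceleration,Displacement,Horsepower,Weight,MPG);

rng('default') % For reproducibility
Mdl = fitrensemble(X,'MPG','Method','Bag','NumLearningCycles',50);
imp = oobPermutedPredictorImportance(Mdl);

p = numel(Mdl.PredictorNames);            % number of predictors
assert(isequal(size(imp),[1 p]))          % imp is a 1-by-p row vector

% ranking is a hypothetical name; both inputs are transposed to p-by-1 columns
ranking = table(Mdl.PredictorNames',imp', ...
    'VariableNames',{'Predictor','Importance'});
ranking = sortrows(ranking,'Importance','descend');
disp(ranking)
```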

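Examples

Estimate Importance of Predictors

Load the `carsmall` data set. Consider a model that predicts the mean fuel economy of a car given its acceleration, number of cylinders, engine displacement, horsepower, manufacturer, model year, and weight. Consider `Cylinders`, `Mfg`, and `Model_Year` as categorical variables.

load carsmall
Cylinders = categorical(Cylinders);
Mfg = categorical(cellstr(Mfg));
Model_Year = categorical(Model_Year);
X = table(Acceleration,Cylinders,Displacement,Horsepower,Mfg,...
    Model_Year,Weight,MPG);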

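You can train a random forest of 500 regression trees using the entire data set.

Mdl = fitrensemble(X,'MPG','Method','Bag','NumLearningCycles',500);

`fitrensemble` uses a default template tree object `templateTree()` as a weak learner when `'Method'` is `'Bag'`. In this example, for reproducibility, specify `'Reproducible',true` when you create a tree template object, and then use the object as a weak learner.

rng('default') % For reproducibility
t = templateTree('Reproducible',true); % For reproducibility of random predictor selections
Mdl = fitrensemble(X,'MPG','Method','Bag','NumLearningCycles',500,'Learners',t);

`Mdl` is a `RegressionBaggedEnsemble` model.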

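Estimate predictor importance measures by permuting out-of-bag observations. Compare the estimates using a bar graph.

imp = oobPermutedPredictorImportance(Mdl);

figure;
bar(imp);
title('Out-of-Bag Permuted Predictor Importance Estimates');
ylabel('Estimates');
xlabel('Predictors');
h = gca;
h.XTickLabel = Mdl.PredictorNames;
h.XTickLabelRotation = 45;
h.TickLabelInterpreter = 'none';

`imp` is a 1-by-7 vector of predictor importance estimates. Larger values indicate predictors that have a greater influence on predictions. In this case, `Weight` is the most important predictor, followed by `Model_Year`.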

Unbiased Estimates of Predictor Importance Using Parallel Computing


load carsmall
Cylinders = categorical(Cylinders);
Mfg = categorical(cellstr(Mfg));
Model_Year = categorical(Model_Year);
X = table(Acceleration,Cylinders,Displacement,Horsepower,Mfg,...
    Model_Year,Weight,MPG);

Display the number of categories represented in the categorical variables.

numCylinders = numel(categories(Cylinders))

numCylinders = 3

numMfg = numel(categories(Mfg))

numMfg = 28

numModelYear = numel(categories(Model_Year))

numModelYear = 3

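Because `Cylinders` and `Model_Year` contain only 3 categories each, the standard CART predictor-splitting algorithm prefers splitting a continuous predictor over these two variables.

Train a random forest of 500 regression trees using the entire data set. To grow unbiased trees, specify usage of the curvature test for splitting predictors. Because there are missing values in the data, specify usage of surrogate splits. To reproduce random predictor selections, set the seed of the random number generator by using `rng` and specify `'Reproducible',true`.

rng('default'); % For reproducibility
t = templateTree('PredictorSelection','curvature','Surrogate','on',...
    'Reproducible',true); % For reproducibility of random predictor selections
Mdl = fitrensemble(X,'MPG','Method','Bag','NumLearningCycles',500,...
    'Learners',t);

Estimate predictor importance measures by permuting out-of-bag observations. Perform calculations in parallel.

options = statset('UseParallel',true);
imp = oobPermutedPredictorImportance(Mdl,'Options',options);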

Starting parallel pool (parpool) using the 'local' profile ... Connected to the parallel pool (number of workers: 6).

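Compare the estimates using a bar graph.

figure;
bar(imp);
title('Out-of-Bag Permuted Predictor Importance Estimates');
ylabel('Estimates');
xlabel('Predictors');
h = gca;
h.XTickLabel = Mdl.PredictorNames;
h.XTickLabelRotation = 45;
h.TickLabelInterpreter = 'none';

In this case, `Model_Year` is the most important predictor, followed by `Cylinders`. Compare these results to the results in Estimate Importance of Predictors.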

More About


Out-of-Bag, Predictor Importance Estimates by Permutation

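Out-of-bag, predictor importance estimates by permutation measure how influential the predictor variables in the model are at predicting the response. The influence of a predictor increases with the value of this measure.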

If a predictor is influential in prediction, then permuting its values should affect the model error. If a predictor is not influential, then permuting its values should have little to no effect on the model error.

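The following process describes the estimation of out-of-bag predictor importance values by permutation. Suppose that R is a random forest of T learners and p is the number of predictors in the training data.

For tree $t$, $t = 1,...,T$:
1. Identify the out-of-bag observations and the indices of the predictor variables that were split to grow tree $t$, $s_t \subseteq \{1,...,p\}$.
2. Estimate the out-of-bag error, $\varepsilon_t$.
3. For each predictor variable $x_j$, $j \in s_t$:
  1. Randomly permute the observations of $x_j$.
  2. Estimate the model error, $\varepsilon_{tj}$, using the out-of-bag observations containing the permuted values of $x_j$.
  3. Take the difference $d_{tj} = \varepsilon_{tj} - \varepsilon_t$. Predictor variables not split when growing tree $t$ are attributed a difference of 0.

For each predictor variable in the training data, compute the mean, $\bar{d}_j$, and standard deviation, $\sigma_j$, of the differences over the learners, $j = 1,...,p$.

The out-of-bag predictor importance by permutation for $x_j$ is $\bar{d}_j / \sigma_j$.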
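The procedure above can be sketched by hand on synthetic data. This is a simplified illustration and not the internal implementation of `oobPermutedPredictorImportance`. It assumes mean squared error as the model error, plain bootstrap resampling with `fitrtree` for each tree, and made-up data in which the third predictor is pure noise. All variable names are hypothetical.

```matlab
% Hand-rolled sketch of permutation importance on synthetic data.
rng(1)                                   % For reproducibility
n = 200; p = 3; T = 30;
X = randn(n,p);
y = 3*X(:,1) + 0.5*X(:,2) + 0.1*randn(n,1);   % x3 does not affect y

d = zeros(T,p);                          % d(t,j) stays 0 if tree t never splits on x_j
for t = 1:T
    inBag = randi(n,n,1);                % bootstrap sample, drawn with replacement
    oob   = setdiff((1:n)',inBag);       % indices of out-of-bag observations
    tree  = fitrtree(X(inBag,:),y(inBag));

    % s_t: indices of the predictors that tree t actually split on
    st = find(ismember(tree.PredictorNames,tree.CutPredictor));

    % Out-of-bag error of tree t (assumed here to be mean squared error)
    errT = mean((y(oob) - predict(tree,X(oob,:))).^2);

    for j = st
        Xperm = X(oob,:);
        Xperm(:,j) = Xperm(randperm(numel(oob)),j);   % permute x_j among OOB rows
        errTJ = mean((y(oob) - predict(tree,Xperm)).^2);
        d(t,j) = errTJ - errT;           % d_tj = eps_tj - eps_t
    end
end

% Mean difference divided by its standard deviation, per predictor.
% A predictor that no tree splits on would give 0/0 = NaN here.
imp = mean(d,1) ./ std(d,0,1);
disp(imp)
```

In this setup the first predictor should receive a clearly larger value than the noise predictor, because permuting it degrades the out-of-bag error in almost every tree.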

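Tips

When growing a random forest using `fitrensemble`:

- Standard CART tends to select split predictors containing many distinct values, e.g., continuous variables, over those containing few distinct values, e.g., categorical variables [3]. If the predictor data set is heterogeneous, or if there are predictors that have relatively fewer distinct values than other variables, then consider specifying the curvature or interaction test.
- Trees grown using standard CART are not sensitive to predictor variable interactions. Also, such trees are less likely to identify important variables in the presence of many irrelevant predictors than the application of the interaction test. Therefore, to account for predictor interactions and identify important variables in the presence of many irrelevant variables, specify the interaction test [2].
- If the training data includes many predictors and you want to analyze predictor importance, then specify `'NumVariablesToSample'` of the `templateTree` function as `'all'` for the tree learners of the ensemble. Otherwise, the software might not select some predictors, underestimating their importance.

For more details, see `templateTree` and Choose Split Predictor Selection Technique.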
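As a sketch of how these tips translate into a tree template, the following uses the curvature test and lets every tree consider all predictors at each split. The data set, the predictors, and the ensemble size are illustrative assumptions. Passing `'interaction-curvature'` as the `'PredictorSelection'` value would select the interaction test instead.

```matlab
% Illustrative sketch: tree template that follows the tips above.
load carsmall
Cylinders = categorical(Cylinders);
X = table(Acceleration,Cylinders,Displacement,Horsepower,Weight,MPG);

rng('default') % For reproducibility
t = templateTree('PredictorSelection','curvature', ... % unbiased split selection
    'NumVariablesToSample','all', ...                  % consider every predictor at each split
    'Surrogate','on', ...                              % handle missing predictor values
    'Reproducible',true);
Mdl = fitrensemble(X,'MPG','Method','Bag','NumLearningCycles',50,'Learners',t);
imp = oobPermutedPredictorImportance(Mdl);
```

Sampling all predictors at each split removes the random feature subsetting of a typical random forest, so this configuration trades some decorrelation between trees for fairer importance estimates.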

References

[1] Breiman, L., J. Friedman, R. Olshen, and C. Stone. Classification and Regression Trees. Boca Raton, FL: CRC Press, 1984.

[2] Loh, W.Y. "Regression Trees with Unbiased Variable Selection and Interaction Detection." Statistica Sinica, Vol. 12, 2002, pp. 361–386.

[3] Loh, W.Y. and Y.S. Shih. "Split Selection Methods for Classification Trees." Statistica Sinica, Vol. 7, 1997, pp. 815–840.

Extended Capabilities

Automatic Parallel Support
Accelerate code by automatically running computation in parallel using Parallel Computing Toolbox™.

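To run in parallel, set the `'UseParallel'` option to `true`.

Set the `'UseParallel'` field of the options structure to `true` using `statset`, and specify the `'Options'` name-value pair argument in the call to this function.

For example: `'Options',statset('UseParallel',true)`

For more information, see the `'Options'` name-value pair argument.

For more general information about parallel computing, see Run MATLAB Functions with Automatic Parallel Support (Parallel Computing Toolbox).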
