predict

Class: GeneralizedLinearMixedModel

Predict response of generalized linear mixed-effects model

Description

ypred = predict(glme) returns the predicted conditional means of the response, ypred, using the original predictor values used to fit the generalized linear mixed-effects model glme.

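ypred = predict(glme,tblnew) returns the predicted conditional means using the new predictor values specified in tblnew.

If a grouping variable in tblnew has levels that are not in the original data, then the random effects for that grouping variable do not contribute to the 'Conditional' prediction at observations where the grouping variable has new levels.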

ypred = predict(___,Name,Value) returns the predicted conditional means of the response using additional options specified by one or more Name,Value pair arguments. For example, you can specify the confidence level, simultaneous confidence bounds, or contributions from only fixed effects. You can use any of the input arguments in the previous syntaxes.

[ypred,ypredCI] = predict(___) also returns 95% point-wise confidence intervals, ypredCI, for each predicted value.

[ypred,ypredCI,DF] = predict(___) also returns the degrees of freedom, DF, used to compute the confidence intervals.

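Input Arguments

Generalized linear mixed-effects model, specified as a GeneralizedLinearMixedModel object. For properties and methods of this object, see GeneralizedLinearMixedModel.

New input data, which includes the response variable, predictor variables, and grouping variables, specified as a table or dataset array. The predictor variables can be continuous or grouping variables. tblnew must have the same variables as the original table or dataset array used in fitglme to fit the generalized linear mixed-effects model glme.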

Name-Value Arguments

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Before R2021a, use commas to separate each name and value, and enclose Name in quotes.

Significance level, specified as the comma-separated pair consisting of 'Alpha' and a scalar value in the range [0,1]. For a value α, the confidence level is 100 × (1 – α)%.

For example, for 99% confidence intervals, you can specify the confidence level as follows.

Example: 'Alpha',0.01

Data Types: single | double
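The relationship between 'Alpha' and the confidence level can be checked with a short sketch. It assumes the mfr sample data set and the model formula from the Examples section; the variable names (confLevel, width95, width99, widerAt99) are illustrative and not part of the documented interface.

```matlab
% Sketch only: assumes the mfr sample data and the model from the Examples section.
load mfr
glme = fitglme(mfr, ...
    'defects ~ 1 + newprocess + time_dev + temp_dev + supplier + (1|factory)', ...
    'Distribution','Poisson','Link','log', ...
    'FitMethod','Laplace','DummyVarCoding','effects');

alpha = 0.01;
confLevel = 100*(1 - alpha);              % confidence level in percent

[~,ci95] = predict(glme);                 % default 95% intervals
[~,ci99] = predict(glme,'Alpha',alpha);   % 99% intervals

% Interval widths; a higher confidence level should give wider intervals.
width95 = ci95(:,2) - ci95(:,1);
width99 = ci99(:,2) - ci99(:,1);
widerAt99 = all(width99 > width95);
```

Here widerAt99 is expected to be true, because lowering α can only widen an interval computed by the same method.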

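Indicator for conditional predictions, specified as the comma-separated pair consisting of 'Conditional' and one of the following.

Value    Description
true     Contributions from both fixed effects and random effects (conditional)
false    Contribution from only fixed effects (marginal)

Example: 'Conditional',false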

Method for computing approximate degrees of freedom, specified as the comma-separated pair consisting of 'DFMethod' and one of the following.

Value         Description
'residual'    The degrees of freedom value is assumed to be constant and equal to n – p, where n is the number of observations and p is the number of fixed effects.
'none'        The degrees of freedom is set to infinity.

Example: 'DFMethod','none'
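The two degrees-of-freedom methods can be compared directly. The following is a minimal sketch, assuming the mfr sample data set and the model from the Examples section; taking p as the number of fixed-effects coefficients returned by fixedEffects is an assumption of this sketch, and the variable names are illustrative.

```matlab
% Sketch only: assumes the mfr sample data and the model from the Examples section.
load mfr
glme = fitglme(mfr, ...
    'defects ~ 1 + newprocess + time_dev + temp_dev + supplier + (1|factory)', ...
    'Distribution','Poisson','Link','log', ...
    'FitMethod','Laplace','DummyVarCoding','effects');

% Request the degrees of freedom under each method.
[~,~,dfResidual] = predict(glme,'DFMethod','residual');
[~,~,dfNone]     = predict(glme,'DFMethod','none');

n = height(mfr);                  % number of observations
p = numel(fixedEffects(glme));    % number of fixed-effects coefficients
expectedDF = n - p;               % value described for 'residual'

matchesResidual = all(dfResidual == expectedDF);
allInfinite     = all(isinf(dfNone));
```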

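Model offset, specified as the comma-separated pair consisting of 'Offset' and a vector of scalar values of length m, where m is the number of rows in tblnew. The offset is used as an additional predictor and has a coefficient value fixed at 1.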

Type of confidence bounds, specified as the comma-separated pair consisting of 'Simultaneous' and either false or true.

  • If 'Simultaneous' is false, then predict computes nonsimultaneous confidence bounds.

  • If 'Simultaneous' is true, then predict returns simultaneous confidence bounds.

Example: 'Simultaneous',true
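To see how the two types of bounds differ in practice, the sketch below computes both for the same model and compares interval widths. It assumes the mfr sample data set and the model from the Examples section; all variable names are illustrative.

```matlab
% Sketch only: assumes the mfr sample data and the model from the Examples section.
load mfr
glme = fitglme(mfr, ...
    'defects ~ 1 + newprocess + time_dev + temp_dev + supplier + (1|factory)', ...
    'Distribution','Poisson','Link','log', ...
    'FitMethod','Laplace','DummyVarCoding','effects');

[~,ciPoint] = predict(glme,'Simultaneous',false);   % nonsimultaneous bounds
[~,ciSimul] = predict(glme,'Simultaneous',true);    % simultaneous bounds

widthPoint = ciPoint(:,2) - ciPoint(:,1);
widthSimul = ciSimul(:,2) - ciSimul(:,1);

% Fraction of observations whose simultaneous interval is at least as wide.
fractionWider = mean(widthSimul >= widthPoint);
```

Simultaneous bounds are designed to cover all predictions jointly, so they are typically wider than the point-wise bounds.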

Output Arguments

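Predicted responses, returned as a vector. If the 'Conditional' name-value pair argument is specified as true, ypred contains predictions for the conditional means of the responses given the random effects. Conditional predictions include contributions from both fixed and random effects. Marginal predictions include contribution from only fixed effects.

To compute marginal predictions, predict computes conditional predictions, but substitutes a vector of zeros in place of the empirical Bayes predictors (EBPs) of the random effects.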
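The zero-substitution described above can be reproduced by hand for a model with a log link and no offset. The sketch below is an illustration under those assumptions, using the mfr sample data set and the model from the Examples section; it relies on the designMatrix, fixedEffects, and randomEffects methods of the fitted model, and the variable names are illustrative.

```matlab
% Sketch only: assumes the mfr sample data, a log link, and no model offset.
load mfr
glme = fitglme(mfr, ...
    'defects ~ 1 + newprocess + time_dev + temp_dev + supplier + (1|factory)', ...
    'Distribution','Poisson','Link','log', ...
    'FitMethod','Laplace','DummyVarCoding','effects');

X    = designMatrix(glme,'Fixed');    % fixed-effects design matrix
Z    = designMatrix(glme,'Random');   % random-effects design matrix
beta = fixedEffects(glme);            % estimated fixed-effects coefficients
b    = randomEffects(glme);           % EBPs of the random effects

% Inverse log link applied to the linear predictor.
muCond = exp(X*beta + full(Z*b));     % conditional: fixed + random effects
muMarg = exp(X*beta);                 % marginal: b replaced by zeros

ypredCond = predict(glme,'Conditional',true);
ypredMarg = predict(glme,'Conditional',false);

% Largest absolute discrepancy between manual and predict results.
errCond = max(abs(muCond - ypredCond));
errMarg = max(abs(muMarg - ypredMarg));
```

Both discrepancies should be at the level of floating-point round-off if the assumptions hold.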

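Point-wise confidence intervals for the predicted values, returned as a two-column matrix. The first column of ypredCI contains the lower bound, and the second column contains the upper bound. By default, ypredCI contains the 95% nonsimultaneous confidence intervals for the predictions. You can change the confidence level using the Alpha name-value pair argument, and make the intervals simultaneous using the Simultaneous name-value pair argument.

When fitting a GLME model using fitglme and one of the maximum likelihood fit methods ('Laplace' or 'ApproximateLaplace'), predict computes the confidence intervals using the conditional mean squared error of prediction (CMSEP) approach, conditional on the estimated covariance parameters and the observed response. Alternatively, you can interpret the confidence intervals as approximate Bayesian credible intervals conditional on the estimated covariance parameters and the observed response.

When fitting a GLME model using fitglme and one of the pseudo likelihood fit methods ('MPL' or 'REMPL'), predict bases the computations on the fitted linear mixed-effects model from the final pseudo likelihood iteration.

Degrees of freedom used in computing the confidence intervals, returned as a vector or a scalar value.

  • If 'Simultaneous' is false, then DF is a vector.

  • If 'Simultaneous' is true, then DF is a scalar value.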
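The shape of DF under each setting can be inspected with a brief sketch. It assumes the mfr sample data set and the model from the Examples section; the variable names are illustrative.

```matlab
% Sketch only: assumes the mfr sample data and the model from the Examples section.
load mfr
glme = fitglme(mfr, ...
    'defects ~ 1 + newprocess + time_dev + temp_dev + supplier + (1|factory)', ...
    'Distribution','Poisson','Link','log', ...
    'FitMethod','Laplace','DummyVarCoding','effects');

% Third output is the degrees of freedom used for the intervals.
[~,~,dfPoint] = predict(glme,'Simultaneous',false);
[~,~,dfSimul] = predict(glme,'Simultaneous',true);

sizePoint = size(dfPoint);   % one value per observation
sizeSimul = size(dfSimul);   % a single value
```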

Examples

Load the sample data.

load mfr

This simulated data is from a manufacturing company that operates 50 factories across the world, with each factory running a batch process to create a finished product. The company wants to decrease the number of defects in each batch, so it developed a new manufacturing process. To test the effectiveness of the new process, the company selected 20 of its factories at random to participate in an experiment: Ten factories implemented the new process, while the other ten continued to run the old process. In each of the 20 factories, the company ran five batches (for a total of 100 batches) and recorded the following data:

  • Flag to indicate whether the batch used the new process (newprocess)

  • Processing time for each batch, in hours (time)

  • Temperature of the batch, in degrees Celsius (temp)

  • Categorical variable indicating the supplier (A, B, or C) of the chemical used in the batch (supplier)

  • Number of defects in the batch (defects)

The data also includes time_dev and temp_dev, which represent the absolute deviation of time and temperature, respectively, from the process standard of 3 hours at 20 degrees Celsius.

Fit a generalized linear mixed-effects model using newprocess, time_dev, temp_dev, and supplier as fixed-effects predictors. Include a random-effects term for intercept grouped by factory, to account for quality differences that might exist due to factory-specific variations. The response variable defects has a Poisson distribution, and the appropriate link function for this model is log. Use the Laplace fit method to estimate the coefficients. Specify the dummy variable encoding as 'effects', so the dummy variable coefficients sum to 0.

The number of defects can be modeled using a Poisson distribution:

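$$\text{defects}_{ij} \sim \text{Poisson}(\mu_{ij})$$

This corresponds to the generalized linear mixed-effects model

$$\log(\mu_{ij}) = \beta_0 + \beta_1\,\text{newprocess}_{ij} + \beta_2\,\text{time\_dev}_{ij} + \beta_3\,\text{temp\_dev}_{ij} + \beta_4\,\text{supplier\_C}_{ij} + \beta_5\,\text{supplier\_B}_{ij} + b_i,$$

where

  • $\text{defects}_{ij}$ is the number of defects observed in the batch produced by factory $i$ during batch $j$.

  • $\mu_{ij}$ is the mean number of defects corresponding to factory $i$ (where $i = 1, 2, \ldots, 20$) during batch $j$ (where $j = 1, 2, \ldots, 5$).

  • $\text{newprocess}_{ij}$, $\text{time\_dev}_{ij}$, and $\text{temp\_dev}_{ij}$ are the measurements for each variable that correspond to factory $i$ during batch $j$. For example, $\text{newprocess}_{ij}$ indicates whether the batch produced by factory $i$ during batch $j$ used the new process.

  • $\text{supplier\_C}_{ij}$ and $\text{supplier\_B}_{ij}$ are dummy variables that use effects (sum-to-zero) coding to indicate whether company C or B, respectively, supplied the process chemicals for the batch produced by factory $i$ during batch $j$.

  • $b_i \sim N(0, \sigma_b^2)$ is a random-effects intercept for each factory $i$ that accounts for factory-specific variation in quality.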

glme = fitglme(mfr, ...
    'defects ~ 1 + newprocess + time_dev + temp_dev + supplier + (1|factory)', ...
    'Distribution','Poisson','Link','log', ...
    'FitMethod','Laplace','DummyVarCoding','effects');

Predict the response values at the original design values. Display the first ten predictions along with the observed response values.

ypred = predict(glme);
[ypred(1:10),mfr.defects(1:10)]

ans = 10×2

    4.9883    6.0000
    5.9423    7.0000
    5.1318    6.0000
    5.6295    5.0000
    5.3499    6.0000
    5.2134    5.0000
    4.6430    4.0000
    4.5342    4.0000
    5.3903    9.0000
    4.6529    4.0000

Column 1 contains the predicted response values at the original design values. Column 2 contains the observed response values.

Load the sample data.

load mfr

Fit the same generalized linear mixed-effects model as in the preceding example: newprocess, time_dev, temp_dev, and supplier are fixed-effects predictors, factory contributes a random-effects intercept, the response defects has a Poisson distribution with a log link, the fit method is Laplace, and the dummy variable encoding is 'effects'.

glme = fitglme(mfr, ...
    'defects ~ 1 + newprocess + time_dev + temp_dev + supplier + (1|factory)', ...
    'Distribution','Poisson','Link','log', ...
    'FitMethod','Laplace','DummyVarCoding','effects');

Predict the response values at the original design values.

ypred = predict(glme);

Create a new table by copying the first 10 rows of mfr into tblnew.

tblnew = mfr(1:10,:);

The first 10 rows of mfr include data collected from trials 1 through 5 for factories 1 and 2. Both factories used the old process for all of their trials during the experiment, so newprocess = 0 for all 10 observations.

Change the value of newprocess to 1 for the observations in tblnew.

tblnew.newprocess = ones(height(tblnew),1);

Compute predicted response values and nonsimultaneous 99% confidence intervals using tblnew. Display the first 10 rows of the predicted values based on tblnew, the predicted values based on mfr, and the observed response values.

[ypred_new,ypredCI] = predict(glme,tblnew,'Alpha',0.01);
[ypred_new,ypred(1:10),mfr.defects(1:10)]

ans = 10×3

    3.4536    4.9883    6.0000
    4.1142    5.9423    7.0000
    3.5530    5.1318    6.0000
    3.8976    5.6295    5.0000
    3.7040    5.3499    6.0000
    3.6095    5.2134    5.0000
    3.2146    4.6430    4.0000
    3.1393    4.5342    4.0000
    3.7320    5.3903    9.0000
    3.2214    4.6529    4.0000

Column 1 contains predicted response values based on the data in tblnew, where newprocess = 1. Column 2 contains predicted response values based on the original data in mfr, where newprocess = 0. Column 3 contains the observed response values in mfr. Based on these results, if all other predictors retain their original values, the predicted number of defects appears to be smaller when using the new process.
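One way to read the first two columns: under the log link, switching newprocess from 0 to 1 while every other predictor and the factory random effect stay fixed changes the linear predictor by $\beta_1$ only. The short derivation below is a sketch using the model notation of this example; the superscripts "new" and "old" are labels introduced here for the two columns.

$$\frac{\hat{\mu}_{ij}^{\text{new}}}{\hat{\mu}_{ij}^{\text{old}}} = \frac{\exp\left(\hat{\eta}_{ij} + \hat{\beta}_1\right)}{\exp\left(\hat{\eta}_{ij}\right)} = \exp\left(\hat{\beta}_1\right)$$

Here $\hat{\eta}_{ij}$ stands for the fitted linear predictor with newprocess = 0. For the first row, $3.4536 / 4.9883 \approx 0.692$, and the second row gives $4.1142 / 5.9423 \approx 0.692$ as well, so each predicted count falls by roughly 31%, which suggests $\hat{\beta}_1 \approx \ln(0.692) \approx -0.37$.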

Display the 99% confidence intervals for rows 1 through 10 corresponding to the new predicted response values.

ypredCI(1:10,1:2)
ans = 10×2

    1.6983    7.0235
    1.9191    8.8201
    1.8735    6.7380
    2.0149    7.5395
    1.9034    7.2079
    1.8918    6.8871
    1.6776    6.1597
    1.5404    6.3976
    1.9574    7.1154
    1.6892    6.1436

References

[1] Booth, J.G., and J.P. Hobert. “Standard Errors of Prediction in Generalized Linear Mixed Models.” Journal of the American Statistical Association, Vol. 93, 1998, pp. 262–272.