Main Content

displaypoints

Return points per predictor per bin

Description

example

PointsInfo= displaypoints(sc)returns a table of points for all bins of all predictor variables used in thecreditscorecardobject after a linear logistic regression model is fit usingfitmodelto the Weight of Evidence data. ThePointsInfotable displays information on the predictor name, bin labels, and the corresponding points per bin.

example

[PointsInfo,MinScore,MaxScore] = displaypoints(sc)returns a table of points for all bins of all predictor variables used in thecreditscorecardobject after a linear logistic regression model is fit (fitmodel) to the Weight of Evidence data. ThePointsInfotable displays information on the predictor name, bin labels, and the corresponding points per bin anddisplaypoints。此外,可选的MinScoreandMaxScorevalues are returned.

example

[PointsInfo,MinScore,MaxScore] = displaypoints(___,Name,Value)specifies options using one or more name-value pair arguments in addition to the input arguments in the previous syntax.

Examples

collapse all

This example shows how to usedisplaypointsafter a model is fitted to compute the unscaled points per bin, for a given predictor in thecreditscorecardmodel.

Create acreditscorecardobject using theCreditCardData.matfile to load thedata(using a dataset from Refaat 2011). Use the'IDVar'argument in thecreditscorecardfunction to indicate that'CustID'contains ID information and should not be included as a predictor variable.

loadCreditCardDatasc = creditscorecard(data,'IDVar','CustID');

Perform automatic binning to bin for all predictors.

sc = autobinning(sc);

Fit a linear regression model using default parameters.

sc = fitmodel(sc);
1. Adding CustIncome, Deviance = 1490.8527, Chi2Stat = 32.588614, PValue = 1.1387992e-08 2. Adding TmWBank, Deviance = 1467.1415, Chi2Stat = 23.711203, PValue = 1.1192909e-06 3. Adding AMBalance, Deviance = 1455.5715, Chi2Stat = 11.569967, PValue = 0.00067025601 4. Adding EmpStatus, Deviance = 1447.3451, Chi2Stat = 8.2264038, PValue = 0.0041285257 5. Adding CustAge, Deviance = 1441.994, Chi2Stat = 5.3511754, PValue = 0.020708306 6. Adding ResStatus, Deviance = 1437.8756, Chi2Stat = 4.118404, PValue = 0.042419078 7. Adding OtherCC, Deviance = 1433.707, Chi2Stat = 4.1686018, PValue = 0.041179769 Generalized linear regression model: status ~ [Linear formula with 8 terms in 7 predictors] Distribution = Binomial Estimated Coefficients: Estimate SE tStat pValue ________ ________ ______ __________ (Intercept) 0.70239 0.064001 10.975 5.0538e-28 CustAge 0.60833 0.24932 2.44 0.014687 ResStatus 1.377 0.65272 2.1097 0.034888 EmpStatus 0.88565 0.293 3.0227 0.0025055 CustIncome 0.70164 0.21844 3.2121 0.0013179 TmWBank 1.1074 0.23271 4.7589 1.9464e-06 OtherCC 1.0883 0.52912 2.0569 0.039696 AMBalance 1.045 0.32214 3.2439 0.0011792 1200 observations, 1192 error degrees of freedom Dispersion: 1 Chi^2-statistic vs. constant model: 89.7, p-value = 1.4e-16

Display unscaled points for predictors retained in the fitting model.

PointsInfo = displaypoints(sc)
PointsInfo=37×3 table预测本点  ______________ ________________ _________ {'CustAge' } {'[-Inf,33)' } -0.15894 {'CustAge' } {'[33,37)' } -0.14036 {'CustAge' } {'[37,40)' } -0.060323 {'CustAge' } {'[40,46)' } 0.046408 {'CustAge' } {'[46,48)' } 0.21445 {'CustAge' } {'[48,58)' } 0.23039 {'CustAge' } {'[58,Inf]' } 0.479 {'CustAge' } {'' } NaN {'ResStatus' } {'Tenant' } -0.031252 {'ResStatus' } {'Home Owner' } 0.12696 {'ResStatus' } {'Other' } 0.37641 {'ResStatus' } {'' } NaN {'EmpStatus' } {'Unknown' } -0.076317 {'EmpStatus' } {'Employed' } 0.31449 {'EmpStatus' } {'' } NaN {'CustIncome'} {'[-Inf,29000)'} -0.45716 ⋮

displaypointsalways displays a''bin for each predictor. The value of the''bin comes from the initialcreditscorecardobject, and the''bin is set toNaNwhenever the scorecard model has no information on how to assign points to missing data.

To configure the points for the''bin, you must use the initialcreditscorecardobject. For predictors that have missing values in the training set, the points for the''bin are estimated from the data if the'BinMissingData'名称-值对的观点is set totrueusingcreditscorecard。When the'BinMissingData'parameter is set tofalse, or when the data contains no missing values in the training set, use the'Missing'名称-值对的观点informatpointsto indicate how to assign points to the missing data.

Create acreditscorecardobject using theCreditCardData.matfile to load thedatawith missing values.

loadCreditCardData.mathead(dataMissing,5)
ans=5×11 tableCustID CustAge TmAtAddress ResStatus EmpStatus CustIncome TmWBank OtherCC AMBalance UtilRate status ______ _______ ___________ ___________ _________ __________ _______ _______ _________ ________ ______ 1 53 62  Unknown 50000 55 Yes 1055.9 0.22 0 2 61 22 Home Owner Employed 52000 25 Yes 1161.6 0.24 0 3 47 30 Tenant Employed 37000 61 No 877.23 0.29 0 4 NaN 75 Home Owner Employed 53000 20 Yes 157.37 0.08 0 5 68 56 Home Owner Employed 53000 14 Yes 561.84 0.11 0
fprintf('Number of rows: %d\n',height(dataMissing))
Number of rows: 1200
fprintf('Number of missing values CustAge: %d\n',sum(ismissing(dataMissing.CustAge)))
Number of missing values CustAge: 30
fprintf('Number of missing values ResStatus: %d\n',sum(ismissing(dataMissing.ResStatus)))
Number of missing values ResStatus: 40

Usecreditscorecardwith the name-value argument'BinMissingData'set totrueto bin the missing numeric or categorical data in a separate bin. Apply automatic binning.

sc = creditscorecard(dataMissing,'IDVar','CustID','BinMissingData',true); sc = autobinning(sc); disp(sc)
creditscorecard with properties: GoodLabel: 0 ResponseVar: 'status' WeightsVar: '' VarNames: {1x11 cell} NumericPredictors: {1x6 cell} CategoricalPredictors: {'ResStatus' 'EmpStatus' 'OtherCC'} BinMissingData: 1 IDVar: 'CustID' PredictorVars: {1x9 cell} Data: [1200x11 table]

Display and plot bin information for numeric data for'CustAge'that includes missing data in a separate bin labelled

[bi,cp] = bininfo(sc,'CustAge'); disp(bi)
Bin Good Bad Odds WOE InfoValue _____________ ____ ___ ______ ________ __________ {'[-Inf,33)'} 69 52 1.3269 -0.42156 0.018993 {'[33,37)' } 63 45 1.4 -0.36795 0.012839 {'[37,40)' } 72 47 1.5319 -0.2779 0.0079824 {'[40,46)' } 172 89 1.9326 -0.04556 0.0004549 {'[46,48)' } 59 25 2.36 0.15424 0.0016199 {'[48,51)' } 99 41 2.4146 0.17713 0.0035449 {'[51,58)' } 157 62 2.5323 0.22469 0.0088407 {'[58,Inf]' } 93 25 3.72 0.60931 0.032198 {''} 19 11 1.7273 -0.15787 0.00063885 {'Totals' } 803 397 2.0227 NaN 0.087112
plotbins(sc,'CustAge')

Figure contains an axes object. The axes object with title CustAge contains 3 objects of type bar, line. These objects represent Good, Bad.

Display and plot bin information for categorical data for'ResStatus'that includes missing data in a separate bin labelled

[bi,cg] = bininfo(sc,'ResStatus'); disp(bi)
Bin Good Bad Odds WOE InfoValue ______________ ____ ___ ______ _________ __________ {'Tenant' } 296 161 1.8385 -0.095463 0.0035249 {'Home Owner'} 352 171 2.0585 0.017549 0.00013382 {'Other' } 128 52 2.4615 0.19637 0.0055808 {'' } 27 13 2.0769 0.026469 2.3248e-05 {'Totals' } 803 397 2.0227 NaN 0.0092627
plotbins(sc,'ResStatus')

Figure contains an axes object. The axes object with title ResStatus contains 3 objects of type bar, line. These objects represent Good, Bad.

For the'CustAge'and'ResStatus'predictors, there is missing data (NaNsand) in the training data, and the binning process estimates a WOE value of-0.15787and0.026469respectively for missing data in these predictors, as shown above.

Usefitmodelto fit a logistic regression model using Weight of Evidence (WOE) data.fitmodelinternally transforms all the predictor variables into WOE values, using the bins found with the automatic binning process.fitmodelthen fits a logistic regression model using a stepwise method (by default). For predictors that have missing data, there is an explicitbin, with a corresponding WOE value computed from the data. When usingfitmodel, the corresponding WOE value for the bin is applied when performing the WOE transformation.

[sc,mdl] = fitmodel(sc);
1. Adding CustIncome, Deviance = 1490.8527, Chi2Stat = 32.588614, PValue = 1.1387992e-08 2. Adding TmWBank, Deviance = 1467.1415, Chi2Stat = 23.711203, PValue = 1.1192909e-06 3. Adding AMBalance, Deviance = 1455.5715, Chi2Stat = 11.569967, PValue = 0.00067025601 4. Adding EmpStatus, Deviance = 1447.3451, Chi2Stat = 8.2264038, PValue = 0.0041285257 5. Adding CustAge, Deviance = 1442.8477, Chi2Stat = 4.4974731, PValue = 0.033944979 6. Adding ResStatus, Deviance = 1438.9783, Chi2Stat = 3.86941, PValue = 0.049173805 7. Adding OtherCC, Deviance = 1434.9751, Chi2Stat = 4.0031966, PValue = 0.045414057 Generalized linear regression model: status ~ [Linear formula with 8 terms in 7 predictors] Distribution = Binomial Estimated Coefficients: Estimate SE tStat pValue ________ ________ ______ __________ (Intercept) 0.70229 0.063959 10.98 4.7498e-28 CustAge 0.57421 0.25708 2.2335 0.025513 ResStatus 1.3629 0.66952 2.0356 0.04179 EmpStatus 0.88373 0.2929 3.0172 0.002551 CustIncome 0.73535 0.2159 3.406 0.00065929 TmWBank 1.1065 0.23267 4.7556 1.9783e-06 OtherCC 1.0648 0.52826 2.0156 0.043841 AMBalance 1.0446 0.32197 3.2443 0.0011775 1200 observations, 1192 error degrees of freedom Dispersion: 1 Chi^2-statistic vs. constant model: 88.5, p-value = 2.55e-16

Display unscaled points for predictors retained in the fitting model (to scale points useformatpoints).

PointsInfo = displaypoints(sc)
PointsInfo=38×3 tablePredictors Bin Points _____________ ______________ _________ {'CustAge' } {'[-Inf,33)' } -0.14173 {'CustAge' } {'[33,37)' } -0.11095 {'CustAge' } {'[37,40)' } -0.059244 {'CustAge' } {'[40,46)' } 0.074167 {'CustAge' } {'[46,48)' } 0.1889 {'CustAge' } {'[48,51)' } 0.20204 {'CustAge' } {'[51,58)' } 0.22935 {'CustAge' } {'[58,Inf]' } 0.45019 {'CustAge' } {'' } 0.0096749 {'ResStatus'} {'Tenant' } -0.029778 {'ResStatus'} {'Home Owner'} 0.12425 {'ResStatus'} {'Other' } 0.36796 {'ResStatus'} {'' } 0.1364 {'EmpStatus'} {'Unknown' } -0.075948 {'EmpStatus'} {'Employed' } 0.31401 {'EmpStatus'} {'' } NaN ⋮

Notice that points for thebin forCustAgeandResStatusare explicitly shown. These points are computed from the WOE value for the bin and the logistic model coefficients.

For predictors that have no missing data in the training set, there is no explicitbin, and by default the points are set toNaNfor missing data, and they lead to a score ofNaNwhen runningscore。For predictors that have no explicitbin, use the name-value argument'Missing'informatpointsto indicate how missing data should be treated for scoring purposes.

This example shows how to useformatpointsafter a model is fitted to format scaled points, and then usedisplaypointsto display the scaled points per bin, for a given predictor in thecreditscorecardmodel.

Points become scaled when a range is defined. Specifically, a linear transformation from the unscaled to the scaled points is necessary. This transformation is defined either by supplying a shift and slope or by specifying the worst and best scores possible. (For more information, seeformatpoints。)

Create acreditscorecardobject using theCreditCardData.matfile to load thedata(using a dataset from Refaat 2011). Use the'IDVar'argument in thecreditscorecardfunction to indicate that'CustID'contains ID information and should not be included as a predictor variable.

loadCreditCardDatasc = creditscorecard(data,'IDVar','CustID');

Perform automatic binning to bin for all predictors.

sc = autobinning(sc);

Fit a linear regression model using default parameters.

sc = fitmodel(sc);
1. Adding CustIncome, Deviance = 1490.8527, Chi2Stat = 32.588614, PValue = 1.1387992e-08 2. Adding TmWBank, Deviance = 1467.1415, Chi2Stat = 23.711203, PValue = 1.1192909e-06 3. Adding AMBalance, Deviance = 1455.5715, Chi2Stat = 11.569967, PValue = 0.00067025601 4. Adding EmpStatus, Deviance = 1447.3451, Chi2Stat = 8.2264038, PValue = 0.0041285257 5. Adding CustAge, Deviance = 1441.994, Chi2Stat = 5.3511754, PValue = 0.020708306 6. Adding ResStatus, Deviance = 1437.8756, Chi2Stat = 4.118404, PValue = 0.042419078 7. Adding OtherCC, Deviance = 1433.707, Chi2Stat = 4.1686018, PValue = 0.041179769 Generalized linear regression model: status ~ [Linear formula with 8 terms in 7 predictors] Distribution = Binomial Estimated Coefficients: Estimate SE tStat pValue ________ ________ ______ __________ (Intercept) 0.70239 0.064001 10.975 5.0538e-28 CustAge 0.60833 0.24932 2.44 0.014687 ResStatus 1.377 0.65272 2.1097 0.034888 EmpStatus 0.88565 0.293 3.0227 0.0025055 CustIncome 0.70164 0.21844 3.2121 0.0013179 TmWBank 1.1074 0.23271 4.7589 1.9464e-06 OtherCC 1.0883 0.52912 2.0569 0.039696 AMBalance 1.045 0.32214 3.2439 0.0011792 1200 observations, 1192 error degrees of freedom Dispersion: 1 Chi^2-statistic vs. constant model: 89.7, p-value = 1.4e-16

Use theformatpointsfunction to scale providing the'Worst'and'Best'score values. The range provided below is a common score range.

sc = formatpoints(sc,'WorstAndBestScores',[300 850]);

Display the points information again to verify that the points are now scaled and also display the scaled minimum and maximum scores.

[PointsInfo,MinScore,MaxScore] = displaypoints(sc)
PointsInfo=37×3 table预测本点  ______________ ________________ ______ {'CustAge' } {'[-Inf,33)' } 46.396 {'CustAge' } {'[33,37)' } 48.727 {'CustAge' } {'[37,40)' } 58.772 {'CustAge' } {'[40,46)' } 72.167 {'CustAge' } {'[46,48)' } 93.256 {'CustAge' } {'[48,58)' } 95.256 {'CustAge' } {'[58,Inf]' } 126.46 {'CustAge' } {'' } NaN {'ResStatus' } {'Tenant' } 62.421 {'ResStatus' } {'Home Owner' } 82.276 {'ResStatus' } {'Other' } 113.58 {'ResStatus' } {'' } NaN {'EmpStatus' } {'Unknown' } 56.765 {'EmpStatus' } {'Employed' } 105.81 {'EmpStatus' } {'' } NaN {'CustIncome'} {'[-Inf,29000)'} 8.9706 ⋮
MinScore = 300.0000
MaxScore = 850.0000

Notice that, as expected, the values ofMinScoreandMaxScorecorrespond to the worst and best possible scores.

This example shows how to usedisplaypointsafter a model is fitted to separate the base points from the rest of the points assigned to each predictor variable. The name-value pair argument'BasePoints'in theformatpointsfunction is a boolean that serves this purpose. By default, the base points are spread across all variables in the scorecard.

Create acreditscorecardobject using theCreditCardData.matfile to load thedata(using a dataset from Refaat 2011). Use the'IDVar'argument in thecreditscorecardfunction to indicate that'CustID'contains ID information and should not be included as a predictor variable.

loadCreditCardDatasc = creditscorecard(data,'IDVar','CustID');

Perform automatic binning to bin for all predictors.

sc = autobinning(sc);

Fit a linear regression model using default parameters.

sc = fitmodel(sc);
1. Adding CustIncome, Deviance = 1490.8527, Chi2Stat = 32.588614, PValue = 1.1387992e-08 2. Adding TmWBank, Deviance = 1467.1415, Chi2Stat = 23.711203, PValue = 1.1192909e-06 3. Adding AMBalance, Deviance = 1455.5715, Chi2Stat = 11.569967, PValue = 0.00067025601 4. Adding EmpStatus, Deviance = 1447.3451, Chi2Stat = 8.2264038, PValue = 0.0041285257 5. Adding CustAge, Deviance = 1441.994, Chi2Stat = 5.3511754, PValue = 0.020708306 6. Adding ResStatus, Deviance = 1437.8756, Chi2Stat = 4.118404, PValue = 0.042419078 7. Adding OtherCC, Deviance = 1433.707, Chi2Stat = 4.1686018, PValue = 0.041179769 Generalized linear regression model: status ~ [Linear formula with 8 terms in 7 predictors] Distribution = Binomial Estimated Coefficients: Estimate SE tStat pValue ________ ________ ______ __________ (Intercept) 0.70239 0.064001 10.975 5.0538e-28 CustAge 0.60833 0.24932 2.44 0.014687 ResStatus 1.377 0.65272 2.1097 0.034888 EmpStatus 0.88565 0.293 3.0227 0.0025055 CustIncome 0.70164 0.21844 3.2121 0.0013179 TmWBank 1.1074 0.23271 4.7589 1.9464e-06 OtherCC 1.0883 0.52912 2.0569 0.039696 AMBalance 1.045 0.32214 3.2439 0.0011792 1200 observations, 1192 error degrees of freedom Dispersion: 1 Chi^2-statistic vs. constant model: 89.7, p-value = 1.4e-16

Use theformatpointsfunction to separate the base points by providing the'BasePoints'名称-值对的观点。

sc = formatpoints(sc,'BasePoints',true);

Display the base points, separated out from the other points, for predictors retained in the fitting model.

PointsInfo = displaypoints(sc)
PointsInfo=38×3 table预测本点  ______________ ______________ _________ {'BasePoints'} {'BasePoints'} 0.70239 {'CustAge' } {'[-Inf,33)' } -0.25928 {'CustAge' } {'[33,37)' } -0.24071 {'CustAge' } {'[37,40)' } -0.16066 {'CustAge' } {'[40,46)' } -0.053933 {'CustAge' } {'[46,48)' } 0.11411 {'CustAge' } {'[48,58)' } 0.13005 {'CustAge' } {'[58,Inf]' } 0.37866 {'CustAge' } {'' } NaN {'ResStatus' } {'Tenant' } -0.13159 {'ResStatus' } {'Home Owner'} 0.026616 {'ResStatus' } {'Other' } 0.27607 {'ResStatus' } {'' } NaN {'EmpStatus' } {'Unknown' } -0.17666 {'EmpStatus' } {'Employed' } 0.21415 {'EmpStatus' } {'' } NaN ⋮

This example shows how to usedisplaypointsafter a model is fitted and themodifybinsfunction is used to provide user-defined bin labels for a numeric predictor.

Create acreditscorecardobject using theCreditCardData.matfile to load thedata(using a dataset from Refaat 2011). Use the'IDVar'argument in thecreditscorecardfunction to indicate that'CustID'contains ID information and should not be included as a predictor variable.

loadCreditCardDatasc = creditscorecard(data,'IDVar','CustID');

Perform automatic binning to bin for all predictors.

sc = autobinning(sc);

Fit a linear regression model using default parameters.

sc = fitmodel(sc);
1. Adding CustIncome, Deviance = 1490.8527, Chi2Stat = 32.588614, PValue = 1.1387992e-08 2. Adding TmWBank, Deviance = 1467.1415, Chi2Stat = 23.711203, PValue = 1.1192909e-06 3. Adding AMBalance, Deviance = 1455.5715, Chi2Stat = 11.569967, PValue = 0.00067025601 4. Adding EmpStatus, Deviance = 1447.3451, Chi2Stat = 8.2264038, PValue = 0.0041285257 5. Adding CustAge, Deviance = 1441.994, Chi2Stat = 5.3511754, PValue = 0.020708306 6. Adding ResStatus, Deviance = 1437.8756, Chi2Stat = 4.118404, PValue = 0.042419078 7. Adding OtherCC, Deviance = 1433.707, Chi2Stat = 4.1686018, PValue = 0.041179769 Generalized linear regression model: status ~ [Linear formula with 8 terms in 7 predictors] Distribution = Binomial Estimated Coefficients: Estimate SE tStat pValue ________ ________ ______ __________ (Intercept) 0.70239 0.064001 10.975 5.0538e-28 CustAge 0.60833 0.24932 2.44 0.014687 ResStatus 1.377 0.65272 2.1097 0.034888 EmpStatus 0.88565 0.293 3.0227 0.0025055 CustIncome 0.70164 0.21844 3.2121 0.0013179 TmWBank 1.1074 0.23271 4.7589 1.9464e-06 OtherCC 1.0883 0.52912 2.0569 0.039696 AMBalance 1.045 0.32214 3.2439 0.0011792 1200 observations, 1192 error degrees of freedom Dispersion: 1 Chi^2-statistic vs. constant model: 89.7, p-value = 1.4e-16

Use thedisplaypointsfunction to display point information.

[PointsInfo,MinScore,MaxScore] = displaypoints(sc)
PointsInfo=37×3 table预测本点  ______________ ________________ _________ {'CustAge' } {'[-Inf,33)' } -0.15894 {'CustAge' } {'[33,37)' } -0.14036 {'CustAge' } {'[37,40)' } -0.060323 {'CustAge' } {'[40,46)' } 0.046408 {'CustAge' } {'[46,48)' } 0.21445 {'CustAge' } {'[48,58)' } 0.23039 {'CustAge' } {'[58,Inf]' } 0.479 {'CustAge' } {'' } NaN {'ResStatus' } {'Tenant' } -0.031252 {'ResStatus' } {'Home Owner' } 0.12696 {'ResStatus' } {'Other' } 0.37641 {'ResStatus' } {'' } NaN {'EmpStatus' } {'Unknown' } -0.076317 {'EmpStatus' } {'Employed' } 0.31449 {'EmpStatus' } {'' } NaN {'CustIncome'} {'[-Inf,29000)'} -0.45716 ⋮
MinScore = -1.3100
MaxScore = 3.0726

Use themodifybinsfunction to specify user-defined bin labels for'CustAge'so that the bin ranges are described in natural language.

labels = {'Up to 32','33 to 36','37 to 39','40 to 45','46 to 47','48 to 57','At least 58'}; sc = modifybins(sc,'CustAge','BinLabels',labels);

Rerundisplaypointsto verify the updated bin labels.

[PointsInfo,MinScore,MaxScore] = displaypoints(sc)
PointsInfo=37×3 table预测本点  ______________ ________________ _________ {'CustAge' } {'Up to 32' } -0.15894 {'CustAge' } {'33 to 36' } -0.14036 {'CustAge' } {'37 to 39' } -0.060323 {'CustAge' } {'40 to 45' } 0.046408 {'CustAge' } {'46 to 47' } 0.21445 {'CustAge' } {'48 to 57' } 0.23039 {'CustAge' } {'At least 58' } 0.479 {'CustAge' } {'' } NaN {'ResStatus' } {'Tenant' } -0.031252 {'ResStatus' } {'Home Owner' } 0.12696 {'ResStatus' } {'Other' } 0.37641 {'ResStatus' } {'' } NaN {'EmpStatus' } {'Unknown' } -0.076317 {'EmpStatus' } {'Employed' } 0.31449 {'EmpStatus' } {'' } NaN {'CustIncome'} {'[-Inf,29000)'} -0.45716 ⋮
MinScore = -1.3100
MaxScore = 3.0726

This example shows how to use a credit scorecard to compute the weights of the predictors. The weights of the predictors are determined from the range of points of each predictor, divided by the total range of points for the scorecard. The points for the scorecard not only take into consideration the betas, but also implicitly the binning of the predictor values and the corresponding weights of evidence.

Create a scorecard.

loadCreditCardData.matsc = creditscorecard(data,'IDVar','CustID'); sc = autobinning(sc); sc = fitmodel(sc);
1. Adding CustIncome, Deviance = 1490.8527, Chi2Stat = 32.588614, PValue = 1.1387992e-08 2. Adding TmWBank, Deviance = 1467.1415, Chi2Stat = 23.711203, PValue = 1.1192909e-06 3. Adding AMBalance, Deviance = 1455.5715, Chi2Stat = 11.569967, PValue = 0.00067025601 4. Adding EmpStatus, Deviance = 1447.3451, Chi2Stat = 8.2264038, PValue = 0.0041285257 5. Adding CustAge, Deviance = 1441.994, Chi2Stat = 5.3511754, PValue = 0.020708306 6. Adding ResStatus, Deviance = 1437.8756, Chi2Stat = 4.118404, PValue = 0.042419078 7. Adding OtherCC, Deviance = 1433.707, Chi2Stat = 4.1686018, PValue = 0.041179769 Generalized linear regression model: status ~ [Linear formula with 8 terms in 7 predictors] Distribution = Binomial Estimated Coefficients: Estimate SE tStat pValue ________ ________ ______ __________ (Intercept) 0.70239 0.064001 10.975 5.0538e-28 CustAge 0.60833 0.24932 2.44 0.014687 ResStatus 1.377 0.65272 2.1097 0.034888 EmpStatus 0.88565 0.293 3.0227 0.0025055 CustIncome 0.70164 0.21844 3.2121 0.0013179 TmWBank 1.1074 0.23271 4.7589 1.9464e-06 OtherCC 1.0883 0.52912 2.0569 0.039696 AMBalance 1.045 0.32214 3.2439 0.0011792 1200 observations, 1192 error degrees of freedom Dispersion: 1 Chi^2-statistic vs. constant model: 89.7, p-value = 1.4e-16

Compute scorecard points and theMinPtsandMaxPtsscores.

sc = formatpoints(sc,'PointsOddsAndPDO',[500 2 50]); [PointsTable,MinPts,MaxPts] = displaypoints(sc); PtsRange = MaxPts-MinPts; disp(PointsTable(1:10,:));
Predictors Bin Points _____________ ______________ ______ {'CustAge' } {'[-Inf,33)' } 52.821 {'CustAge' } {'[33,37)' } 54.161 {'CustAge' } {'[37,40)' } 59.934 {'CustAge' } {'[40,46)' } 67.633 {'CustAge' } {'[46,48)' } 79.755 {'CustAge' } {'[48,58)' } 80.905 {'CustAge' } {'[58,Inf]' } 98.838 {'CustAge' } {'' } NaN {'ResStatus'} {'Tenant' } 62.031 {'ResStatus'} {'Home Owner'} 73.444
fprintf('Min points: %g, Max points: %g\n',MinPts,MaxPts);
Min points: 355.505, Max points: 671.64

Compute the predictor weights.

Predictor = unique(PointsTable.Predictors,'stable'); NumPred = length(Predictor); Weight = zeros(NumPred,1);forii=1:NumPred Ind = cellfun(@(x)strcmpi(Predictor{ii},x),PointsTable.Predictors); MaxPtsPred = max(PointsTable.Points(Ind)); MinPtsPred = min(PointsTable.Points(Ind)); Weight(ii) = 100*(MaxPtsPred-MinPtsPred)/PtsRange;endPredictorWeights = table(Predictor,Weight); PredictorWeights(end+1,:) = PredictorWeights(end,:); PredictorWeights.Predictor{end} ='Total'; PredictorWeights.Weight(end) = sum(Weight); disp(PredictorWeights)
Predictor Weight ______________ ______ {'CustAge' } 14.556 {'ResStatus' } 9.302 {'EmpStatus' } 8.9174 {'CustIncome'} 20.401 {'TmWBank' } 25.884 {'OtherCC' } 7.9885 {'AMBalance' } 12.951 {'Total' } 100

The weights are defined as the range of points for the predictor divided by the range of points for the scorecard.

To create acreditscorecardobject using theCreditCardData.matfile, load thedata(using a dataset from Refaat 2011). Using thedataMissingdataset, set the'BinMissingData'indicator totrue

loadCreditCardData.matsc = creditscorecard(dataMissing,'BinMissingData',true);

Useautobinningwith thecreditscorecardobject.

sc = autobinning(sc);

The binning map or rules for categorical data are summarized in a "category grouping" table, returned as an optional output. By default, each category is placed in a separate bin. Here is the information for the predictorResStatus

[bi,cg] = bininfo(sc,'ResStatus')
bi=5×6 tableBin Good Bad Odds WOE InfoValue ______________ ____ ___ ______ _________ __________ {'Tenant' } 296 161 1.8385 -0.095463 0.0035249 {'Home Owner'} 352 171 2.0585 0.017549 0.00013382 {'Other' } 128 52 2.4615 0.19637 0.0055808 {'' } 27 13 2.0769 0.026469 2.3248e-05 {'Totals' } 803 397 2.0227 NaN 0.0092627
cg=3×2 tableCategory BinNumber ______________ _________ {'Tenant' } 1 {'Home Owner'} 2 {'Other' } 3

To group categories'Tenant'and'Other', modify the category grouping tablecg, so the bin number for'Other'is the same as the bin number for'Tenant'。Then usemodifybinsto update thecreditscorecardobject.

cg.BinNumber(3) = 2; sc = modifybins(sc,'ResStatus','Catg',cg);

Display the updated bin information usingbininfo。Note that the bin labels has been updated and that the bin membership information is contained in the category groupingcg

[bi,cg] = bininfo(sc,'ResStatus')
bi=4×6 tableBin Good Bad Odds WOE InfoValue _____________ ____ ___ ______ _________ __________ {'Group1' } 296 161 1.8385 -0.095463 0.0035249 {'Group2' } 480 223 2.1525 0.062196 0.0022419 {''} 27 13 2.0769 0.026469 2.3248e-05 {'Totals' } 803 397 2.0227 NaN 0.00579
cg=3×2 tableCategory BinNumber ______________ _________ {'Tenant' } 1 {'Home Owner'} 2 {'Other' } 2

Useformatpointswith the'Missing'名称-值对的观点to indicate that missing data is assigned'maxpoints'

sc = formatpoints(sc,'BasePoints',true,'Missing','maxpoints','WorstAndBest',[300 800]);

Usefitmodelto fit the model.

sc = fitmodel(sc,'VariableSelection','fullmodel','Display','Off');

Then usedisplaypoints(Risk Management Toolbox)with thecreditscorecardobject to return a table of points for all bins of all predictor variables used in thecompactCreditScorecardobject. By setting thedisplaypoints(Risk Management Toolbox)名称-值对的观点for 'ShowCategoricalMembers'totrue, all the members contained in each individual group are displayed.

[PointsInfo,MinScore,MaxScore] = displaypoints(sc,'ShowCategoricalMembers',true)
PointsInfo=51×3 tablePredictors Bin Points _______________ ______________ _______ {'BasePoints' } {'BasePoints'} 535.25 {'CustID' } {'[-Inf,121)'} 12.085 {'CustID' } {'[121,241)' } 5.4738 {'CustID' } {'[241,1081)'} -1.4061 {'CustID' } {'[1081,Inf]'} -7.2217 {'CustID' } {'' } 12.085 {'CustAge' } {'[-Inf,33)' } -25.973 {'CustAge' } {'[33,37)' } -22.67 {'CustAge' } {'[37,40)' } -17.122 {'CustAge' } {'[40,46)' } -2.8071 {'CustAge' } {'[46,48)' } 9.5034 {'CustAge' } {'[48,51)' } 10.913 {'CustAge' } {'[51,58)' } 13.844 {'CustAge' } {'[58,Inf]' } 37.541 {'CustAge' } {'' } -9.7271 {'TmAtAddress'} {'[-Inf,23)' } -9.3683 ⋮
MinScore = 300
MaxScore = 800.0000

Input Arguments

collapse all

Credit scorecard model, specified as acreditscorecardobject. Usecreditscorecardto create acreditscorecardobject.

Name-Value Arguments

Specify optional comma-separated pairs ofName,Valuearguments.Nameis the argument name andValueis the corresponding value.Namemust appear inside quotes. You can specify several name and value pair arguments in any order asName1,Value1,...,NameN,ValueN

Example:[PointsInfo,MinScore,MaxScore] = displaypoints(sc,‘ShowCategoricalMembers’,true)

目前的指标如何显示箱标签ries that were grouped together, specified as the comma-separated pair consisting of'ShowCategoricalMembers'and a logical scalar with a value oftrueorfalse

By default, when'ShowCategoricalMembers'isfalse, bin labels are displayed asGroup1,Group2,…,Groupn, or if the bin labels were modified increditscorecard, then the user-defined bin label names are displayed.

If'ShowCategoricalMembers'istrue, all the members contained in each individual group are displayed.

Data Types:logical

Output Arguments

collapse all

One row per bin, per predictor, with the corresponding points, returned as a table. For example:

Predictors Bin Points
Predictor_1 Bin_11 Points_11
Predictor_1 Bin_12 Points_12
Predictor_1 Bin_13 Points_13
。.. 。..
Predictor_1 '' NaN(Default)
Predictor_2 Bin_21 Points_21
Predictor_2 Bin_22 Points_22
Predictor_2 Bin_23 Points_23
。.. 。..
Predictor_2 '' NaN(Default)
Predictor_j Bin_ji Points_ji
。.. 。..
Predictor_j '' NaN(Default)

displaypointsalways displays a''bin for each predictor. The value of the''bin comes from the initialcreditscorecardobject, and the''bin is set toNaNwhenever the scorecard model has no information on how to assign points to missing data.

To configure the points for the''bin, you must use the initialcreditscorecardobject. For predictors that have missing values in the training set, the points for the''bin are estimated from the data if the'BinMissingData'名称-值对的观点for is set totrueusingcreditscorecard。When the'BinMissingData'parameter is set tofalse, or when the data contains no missing values in the training set, use the'Missing'名称-值对的观点informatpointsto indicate how to assign points to the missing data.

Another option is to usefillmissingto specify replacement "fill" values for predictors with aNaNorvalue. If you usefillmissing, then thedisplaypoints''row has the same points as the bin associated with the fill value.

When base points are reported separately (seeformatpoints), the first row of the returnedPointsInfotable contains the base points.

Minimum possible total score, returned as a scalar.

Note

Minimum score is the lowest possible total score in the mathematical sense, independently of whether a low score means high risk or low risk.

Maximum possible total score, returned as a scalar.

Note

Maximum score is the highest possible total score in the mathematical sense, independently of whether a high score means high risk or low risk.

Algorithms

The points for predictorjand biniare, by default, given by

Points_ji = (Shift + Slope*b0)/p + Slope*(bj*WOEj(i))
wherebjis the model coefficient of predictorj,pis the number of predictors in the model, and WOEj(i) is the Weight of Evidence (WOE) value for thei-th bin corresponding to thej-th model predictor.ShiftandSlopeare scaling constants.

When the base points are reported separately (see theformatpoints名称-值对的观点BasePoints), the base points are given by

Base Points = Shift + Slope*b0,
and the points for thej-th predictor,ith行给出的
Points_ji = Slope*(bj*WOEj(i))).

By default, the base points are not reported separately.

The minimum and maximum scores are:

MinScore = Shift + Slope*b0 + min(Slope*b1*WOE1) + ... +min(Slope*bp*WOEp)), MaxScore = Shift + Slope*b0 + max(Slope*b1*WOE1) + ... +max(Slope*bp*WOEp)).

Useformatpointsto control the way points are scaled, rounded, and whether the base points are reported separately. Seeformatpointsfor more information on format parameters and for details and formulas on these formatting options.

References

[1] Anderson, R.The Credit Scoring Toolkit.Oxford University Press, 2007.

[2] Refaat, M.Credit Risk Scorecards: Development and Implementation Using SAS.lulu.com, 2011.

Introduced in R2014b