regression

logisticLackfit

Description

The logisticLackfit action computes the Hosmer and Lemeshow goodness-of-fit test for a logistic regression model. This test assesses whether the observed event rates match the expected event rates in subgroups of the model population. It is a key diagnostic for evaluating the calibration of a logistic regression model.
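For orientation, the comparison of observed and expected event rates is usually summarized by the Hosmer and Lemeshow chi-square statistic. The formula below is the standard textbook form, not a statement of how this action computes it internally; the symbols G, n_g, O_g, and \bar{\pi}_g are notation introduced here for illustration.

```latex
\chi^2_{HL} \;=\; \sum_{g=1}^{G}
  \frac{\left(O_g - n_g \bar{\pi}_g\right)^2}
       {n_g \,\bar{\pi}_g \left(1 - \bar{\pi}_g\right)}
```

Here G is the number of groups, n_g is the number of observations in group g, O_g is the observed number of events in that group, and \bar{\pi}_g is the group's mean predicted event probability. Under the hypothesis of adequate fit, the statistic is conventionally compared with a chi-square distribution on G - 2 degrees of freedom, which matches the default dfReduce value of 2.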

regression.logisticLackfit { binEps=double, cutpt=double | {double-1, ...}, df=double, dfReduce=integer, display={display-options}, nGroups=integer, noncentrality=double, outputTables={output-table-options}, powerAdj=boolean, restore={input-table-options}, table={input-table-options} }
Settings

Parameter: Description
binEps: Specifies the precision of the predicted probabilities that are used for classification. Default: 1E-05.
cutpt: Specifies cutpoints for the Hosmer and Lemeshow partitions.
df: Specifies the degrees of freedom to use for the Hosmer and Lemeshow test.
dfReduce: Specifies the reduction in degrees of freedom for the Hosmer and Lemeshow test. Default: 2.
display: Specifies a list of results tables to send to the client for display.
nGroups: Specifies the maximum number of groups to create for the Hosmer and Lemeshow test. Default: 10.
noncentrality: Specifies the noncentrality parameter for the Hosmer and Lemeshow test. Default: 0.
outputTables: Lists the names of results tables to save as CAS tables on the server.
powerAdj: When set to True, adjusts the number of groups so that the Hosmer and Lemeshow test can maintain power. Default: FALSE.
restore: Restores a logistic regression model from a saved item store (a CAS table containing a BLOB) to perform the lack-of-fit test.
table: Specifies the input data table to be used for the test. This is typically the same data used to fit the model.

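Several of these parameters can be combined in a single call. The following is a minimal sketch, not a validated program: it assumes an active CAS session, that the item store 'myModelStore' has already been created, and that a holdout table exists. The table name 'getheart_valid' is hypothetical and appears nowhere else on this page. Setting dfReduce=0 reflects the common suggestion that no degrees of freedom be subtracted when the test is run on data that was not used to fit the model; treat that choice as an assumption to check against your own requirements.

```sas
proc cas;
   /* Hypothetical holdout table 'getheart_valid'; the item store
      'myModelStore' is assumed to exist already. */
   regression.logisticLackfit /
      table='getheart_valid',
      restore='myModelStore',
      nGroups=8,     /* create at most 8 groups (default is 10)   */
      dfReduce=0;    /* subtract nothing from the group count     */
quit;
```

If the degrees of freedom need to be fixed outright rather than derived from the group count, the df parameter listed above is the alternative to dfReduce.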
Data Preparation
Data Creation for Goodness-of-Fit Test

This SAS code creates a dataset named 'getheart' with patient information, including a binary outcome 'Status' (Dead or Alive) and several risk factors. This data will first be used to train a logistic regression model, and then to evaluate its goodness-of-fit.

/* Create a small example data set with a binary outcome and three risk factors. */
DATA getheart;
  LENGTH STATUS $ 6;
  INFILE CARDS;
  INPUT STATUS $ Age Weight Chol;
  CARDS;
Alive 55 180 250
Dead 60 200 300
Alive 50 170 220
Dead 65 210 320
Alive 45 160 210
Dead 70 220 350
Alive 58 185 260
Dead 62 205 310
;
RUN;

/* Load the data set into the casuser caslib (requires an active CAS session). */
PROC CASUTIL;
  load DATA=getheart outcaslib='casuser' casout='getheart' replace;
QUIT;

Examples

This example first fits a logistic regression model on the 'getheart' data to predict 'Status' based on 'Age' and 'Weight', saving the model into a store named 'myModelStore'. It then uses the `logisticLackfit` action to perform the default Hosmer-Lemeshow test on the restored model.

SAS® / CAS Code (awaiting community validation)
PROC CAS;
  /* Fit the model and save it to the item store 'myModelStore'. */
  regression.logistic / table='getheart', class={'Status'}, model={depvar='Status', effects={'Age', 'Weight'}}, store={name='myModelStore', replace=true};
QUIT;
PROC CAS;
  /* Run the default Hosmer-Lemeshow test on the restored model. */
  regression.logisticLackfit / table='getheart', restore='myModelStore';
QUIT;
Result:
The action returns a 'LackOfFit' table containing the Hosmer-Lemeshow chi-square statistic, its degrees of freedom (DF), and the p-value (Pr > ChiSq). A large p-value means the test finds no evidence of lack of fit. It does not prove that the model is well calibrated, and the test has little power on a data set as small as this one.
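To inspect or keep the returned table, the results can be captured in a CASL variable. This is a sketch under stated assumptions: it presumes the item store and table from the example above exist, the variable name r and the data set name work.lackfit are arbitrary choices made here, and the result key 'LackOfFit' simply follows the table name used on this page. The describe statement is included so that the actual key can be confirmed before it is relied on.

```sas
proc cas;
   /* Capture the action's results in the CASL variable r. */
   regression.logisticLackfit result=r /
      table='getheart',
      restore='myModelStore';

   /* Show the structure of r to confirm the names of the returned tables. */
   describe r;

   /* Print everything the action returned. */
   print r;

   /* Copy one result table to a SAS data set. The key 'LackOfFit' is an
      assumption based on the table name used on this page. */
   saveresult r['LackOfFit'] dataout=work.lackfit;
quit;
```

If describe reports a different key, substitute that name in the saveresult statement.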

This example performs the Hosmer-Lemeshow test using a custom number of groups. The model is refitted with 'Chol' as an additional effect and saved again to the store 'myModelStore'. The `logisticLackfit` action is then called with `nGroups=8`, so the data are partitioned by predicted probability into at most 8 groups instead of the default 10.

SAS® / CAS Code (awaiting community validation)
PROC CAS;
  /* Fit the three-effect model and overwrite the item store. */
  regression.logistic / table='getheart', class={'Status'}, model={depvar='Status', effects={'Age', 'Weight', 'Chol'}}, store={name='myModelStore', replace=true};
QUIT;
PROC CAS;
  /* Request at most 8 groups instead of the default 10. */
  regression.logisticLackfit / table='getheart', restore='myModelStore', nGroups=8;
QUIT;
Result:
A 'LackOfFit' table is generated, with the test based on at most 8 groups. Because the grouping changes, the chi-square statistic, degrees of freedom, and p-value can differ from those of the default 10-group test. With the default dfReduce=2, eight groups give 6 degrees of freedom.

This example demonstrates using specific cutpoints to define the risk groups for the test. After fitting the model, `logisticLackfit` is called with the `cutpt` parameter, which explicitly defines the upper boundaries of the predicted probability for each group. This allows for a more tailored assessment of fit across specific probability ranges.

SAS® / CAS Code (awaiting community validation)
PROC CAS;
  /* Fit the three-effect model and overwrite the item store. */
  regression.logistic / table='getheart', class={'Status'}, model={depvar='Status', effects={'Age', 'Weight', 'Chol'}}, store={name='myModelStore', replace=true};
QUIT;
PROC CAS;
  /* Define the groups by explicit cutpoints on the predicted probability. */
  regression.logisticLackfit / table='getheart', restore='myModelStore', cutpt={0.2, 0.4, 0.6, 0.8};
QUIT;
Result:
The output 'LackOfFit' table shows the test results based on the 5 groups defined by the 4 specified cutpoints (i.e., [0, 0.2], (0.2, 0.4], (0.4, 0.6], (0.6, 0.8], (0.8, 1]). This shows how well the model fits within each predefined probability interval. With the default dfReduce=2, five groups would give 3 degrees of freedom.

FAQ

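What is the primary purpose of the logisticLackfit action in SAS Viya?
It computes the Hosmer and Lemeshow goodness-of-fit test for a logistic regression model by comparing observed and expected event rates within subgroups.

What is the function of the 'restore' parameter?
It names the saved item store (a CAS table containing a BLOB) from which the fitted logistic regression model is restored before the test is run.

How are the groups for the Hosmer and Lemeshow test determined?
By default the action creates up to nGroups groups (default 10). Alternatively, the cutpt parameter specifies the cutpoints for the partitions directly.

What does the 'powerAdj' parameter do?
When set to TRUE, it adjusts the number of groups so that the test can maintain power. The default is FALSE.

How can I specify the degrees of freedom for the test statistic?
Use df to set the degrees of freedom directly, or dfReduce (default 2) to set the reduction in degrees of freedom.

Which input tables are required for the logisticLackfit action?
The examples on this page supply two: the input data table (table) and the item store that holds the fitted model (restore).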