The countregFitModel action analyzes regression models in which the dependent variable takes nonnegative integer or count values. These models are often used to represent the number of times an event occurs. This action supports various models including Poisson, Negative Binomial (NB1 and NB2), and their zero-inflated versions (ZIP, ZINB, ZICMP), as well as Conway-Maxwell-Poisson (CMP) models. It is suitable for both cross-sectional and panel data.
| Parameter | Description |
|---|---|
| bayes | Specifies the options to use for Bayesian analysis. |
| bounds | Imposes simple boundary constraints on the parameter estimates. |
| class | Specifies the classification variables. |
| model | Specifies the dependent variable and independent regressor variables for the regression model. |
| dispmodel | Specifies the dispersion-related regressors for Conway-Maxwell-Poisson models. |
| zeromodel | Specifies the zero-inflated regressors that determine the probability of a zero count. |
| table | Specifies the input data table. |
| freq | Specifies the observation frequency variable. |
| weight | Specifies the observation weight variable. |
| groupid | Specifies an identification variable for panel data models. |
| offset | Specifies the variable to use as an offset in the model. |
| selection | Specifies the model selection method (e.g., FORWARD, BACKWARD, LASSO). |
| store | Stores the estimated model to an item store for later use. |
| output | Specifies an output data table to contain various statistics like predicted values and residuals. |

This DATA step creates a sample table named 'd_counts' through the 'mycas' libref, which is assumed to be assigned to a CAS caslib. The table contains the count variable 'NumInjur', representing the number of injuries, and four explanatory variables ('Sex', 'Age', 'x1', 'x2') that can be used to model the count outcome.

```sas
/* Simulate 1,000 observations with a Poisson-distributed injury count */
DATA mycas.d_counts;
   length Sex $1;
   call streaminit(12345);                          /* fixed seed for reproducibility */
   DO i = 1 to 1000;
      Sex = ifc(rand('UNIFORM') > 0.5, 'F', 'M');   /* IFC returns character; IFN is numeric only */
      Age = 20 + floor(rand('UNIFORM') * 40);       /* integer ages from 20 to 59 */
      x1 = rand('UNIFORM');
      x2 = rand('NORMAL');
      /* Log-linear mean of the simulated count */
      lambda = exp(-1 + 0.5 * (Sex='M') + 0.02 * Age + 0.1 * x1 + 0.3 * x2);
      NumInjur = rand('POISSON', lambda);
      OUTPUT;
   END;
RUN;
```

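
Before fitting a model, it can help to look at the simulated counts. The following is a minimal sketch, not part of the original example. It assumes that the 'mycas' libref is already assigned to a CAS caslib (for example with `libname mycas cas;` after a CAS session has been started), and it uses only the variable names created by the DATA step above.

```sas
/* Assumption: the libref mycas is already assigned to a CAS caslib. */

/* Frequency of each count value; the row for 0 shows the share of zeros */
PROC FREQ data=mycas.d_counts;
   tables NumInjur Sex;
RUN;

/* Marginal mean and variance of the count. A variance far above the mean
   can hint at overdispersion, although variation in the regressors also
   raises the marginal variance of a Poisson outcome. */
PROC MEANS data=mycas.d_counts n mean var min max;
   var NumInjur Age x1 x2;
RUN;
```

The share of zeros and the mean-to-variance comparison give a rough indication of whether a plain Poisson model is plausible or whether a negative binomial or zero-inflated model is worth trying.
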
This example performs a simple Poisson regression to model the number of injuries ('NumInjur') based on the predictors 'Sex', 'Age', and 'x1'.
```sas
PROC CAS;
   countreg.countregFitModel /
      table={name='d_counts'},
      class={'Sex'},
      model={
         depVars={{name='NumInjur'}},
         effects={{vars={'Sex', 'Age', 'x1'}}}
      };
RUN;
```

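
For reference, the model in this example can be written in the standard log-linear Poisson form. This is a sketch under two assumptions: that the action falls back to a Poisson model with a log link when no model type is given, and that 'Sex' enters as a single indicator. The symbols $\mu_i$ and $\beta_k$ are notation introduced here, not names used by the action.

```latex
\begin{aligned}
P(\text{NumInjur}_i = y) &= \frac{e^{-\mu_i}\,\mu_i^{y}}{y!}, \qquad y = 0, 1, 2, \ldots \\
\log \mu_i &= \beta_0 + \beta_1\,\mathbb{1}[\text{Sex}_i = \text{M}]
            + \beta_2\,\text{Age}_i + \beta_3\,x_{1i}
\end{aligned}
```

The sample data were simulated with an intercept of -1 and coefficients 0.5 (Sex='M'), 0.02 (Age), 0.1 (x1), and 0.3 (x2), so the estimates can be compared with these values. Which level of 'Sex' serves as the reference depends on the class parameterization, so the sign of the 'Sex' estimate may be reversed. The regressor 'x2' is left out of this model. Because it is drawn independently of the other regressors, its omission should mainly shift the intercept, by about $0.3^2/2 = 0.045$.
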
This example fits a Zero-Inflated Negative Binomial (ZINB) model. The count model for 'NumInjur' includes 'x1' and 'x2'. The zero-inflation model, which models the probability of excess zeros, includes the 'Sex' variable. This is useful when the data has more zeros than a standard Poisson or Negative Binomial model would predict.
```sas
PROC CAS;
   countreg.countregFitModel /
      table={name='d_counts'},
      class={'Sex'},
      model={
         depVars={{name='NumInjur'}},
         effects={{vars={'x1', 'x2'}}},
         modeloptions={modeltype='ZINB'}
      },
      zeromodel={
         effects={{vars={'Sex'}}}
      },
      output={casout={name='out_zinb', replace=true}, pred='Predicted', probzero='P_Zero'};
RUN;
```

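
The two parts of a zero-inflated negative binomial model can be written as a mixture. This is a sketch, assuming the NB2 variance form and a logistic link for the zero-inflation probability (the example does not set a link explicitly). The symbols $\varphi_i$, $\alpha$, $\mu_i$, $\mathbf{z}_i$, and $\boldsymbol{\gamma}$ are notation introduced here.

```latex
\begin{aligned}
P(Y_i = 0) &= \varphi_i + (1 - \varphi_i)\,(1 + \alpha\mu_i)^{-1/\alpha} \\
P(Y_i = y) &= (1 - \varphi_i)\,
   \frac{\Gamma(y + 1/\alpha)}{y!\,\Gamma(1/\alpha)}
   \left(\frac{1}{1 + \alpha\mu_i}\right)^{1/\alpha}
   \left(\frac{\alpha\mu_i}{1 + \alpha\mu_i}\right)^{y},
   \qquad y = 1, 2, \ldots \\
\log \mu_i &= \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i}, \qquad
\varphi_i = \frac{1}{1 + \exp(-\mathbf{z}_i'\boldsymbol{\gamma})}
\end{aligned}
```

Here $\mathbf{z}_i$ holds the zero-model regressors (an intercept and the 'Sex' indicator in this example), $\varphi_i$ is the probability of an excess zero, and $\alpha$ is the dispersion parameter. The sample data are simulated from a plain Poisson process with no built-in excess zeros, so this example illustrates the syntax rather than a data set that requires zero inflation.
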
An auto insurance provider wants to estimate the expected number of claims per policyholder based on driver age, vehicle type, and region. This helps in adjusting premium rates ...
A large e-commerce platform analyzes user sessions to predict the number of items purchased. Since 95% of browsing sessions result in zero purchases, a Zero-Inflated Negative Bi...
A manufacturing plant tracks defects per batch. However, sensors often fail, leading to missing values (NULLs) in the temperature and pressure readings. The team wants to use LA...