
Regression Analysis on Class and Population Data

The script begins by fitting two linear regression models with `PROC REG` on the `sashelp.class` dataset: first `Weight` as a function of `Height`, then `Weight` as a function of `Height` and `Age`. Next, it sets global titles for the output. A `DATA` step creates the `USPopulation` dataset by reading `Population` values directly from a `datalines` block and computing the derived variables `Year` and `YearSq`. A second `PROC REG` step models `Population` as a function of `Year`, with an `ODS OUTPUT` statement that saves the covariance matrix of the estimates (`covb`) to a `Bmatrix` dataset. The model is then extended interactively by adding `YearSq`. Finally, `PROC PRINT` displays the contents of `Bmatrix`, using the `_run_` variable to identify and group the observations.
Data Analysis

Type: MIXED


The script uses the internal example dataset `sashelp.class`. It also generates an internal `USPopulation` dataset from numerical data provided via `datalines`. The `Bmatrix` dataset is dynamically created as an ODS output from `PROC REG`.

1 Code Block
PROC REG
Explanation :
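This block runs the linear regression procedure (`PROC REG`) on the `sashelp.class` dataset. It first fits a simple model of `Weight` as a function of `Height`. After the first `RUN;`, the procedure stays active, and a second `model` statement fits `Weight` on both `Height` and `Age`. The `var Age;` statement makes `Age` available to the procedure from the start, even though the first model does not use it.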
PROC REG DATA=sashelp.class;
   var Age;
   model Weight = Height;
RUN;

   model Weight = Height Age;
RUN;
QUIT;
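For comparison, the same two models can also be requested in a single run group by giving each `model` statement a label. This is only a sketch of an alternative layout, not the form used in the sample; the labels `HeightOnly` and `HeightAge` are hypothetical names chosen for illustration.

```sas
/* Sketch: both models in one run group, distinguished by labels. */
/* The labels HeightOnly and HeightAge are illustrative names.    */
proc reg data=sashelp.class;
   HeightOnly: model Weight = Height;
   HeightAge:  model Weight = Height Age;
run;
quit;
```

With this layout no `var` statement is needed, because every variable already appears in a `model` statement before the first `run;`.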
2 Code Block
DATA STEP Data
Explanation :
This `DATA` step creates a new dataset named `USPopulation`. The `Population` variable is read from the values in the `datalines` block. `Year` is retained across iterations with an initial value of 1780 and incremented by 10 on each observation, so the first observation has `Year` equal to 1790; `YearSq` is the square of `Year`. `Population` is then scaled by dividing by 1000. The reference `input Population @code_sas_json/...` present in the original script is a SAS syntax error. Because each data line holds several values, it is corrected here to `input Population @@;`: the double trailing `@@` holds the input line across iterations so that every value becomes its own observation (19 in total), whereas a plain `input Population;` would read only the first value on each line.
title1 'US Population Study';
title2 'Concatenating Two Tables into One Data Set';

DATA USPopulation;
   INPUT Population @@;   /* @@ holds the line so every value is read */
   retain Year 1780;
   Year=Year+10;
   YearSq=Year*Year;
   Population=Population/1000;
   DATALINES;
3929 5308 7239 9638 12866 17069 23191 31443 39818 50155
62947 75994 91972 105710 122775 131669 151325 179323 203211
;
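How an `input` statement consumes data lines determines how many observations the step produces. The sketch below uses a small made-up dataset (the names `OnePerLine`, `AllValues`, and `Value` are hypothetical) to contrast a plain `input` with one that ends in the double trailing `@@`.

```sas
/* Sketch with made-up data: plain INPUT reads one value per line. */
data OnePerLine;
   input Value;
   datalines;
10 20 30
40 50
;

/* With @@ the line is held across iterations of the DATA step,    */
/* so every value on every line becomes its own observation.       */
data AllValues;
   input Value @@;
   datalines;
10 20 30
40 50
;

proc print data=OnePerLine; run;   /* 2 observations: 10, 40 */
proc print data=AllValues;  run;   /* 5 observations: 10 20 30 40 50 */
```

Applied to the population data, which has ten values on the first line and nine on the second, the `@@` form is what yields one observation per census value.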
3 Code Block
PROC REG
Explanation :
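This `PROC REG` step models `Population` as a function of `Year` using the `USPopulation` dataset. The `ODS OUTPUT` statement saves the covariance matrix of the parameter estimates (the `covb` table, requested with the `covb` option on the `model` statement) to a dataset named `Bmatrix`. The `persist=run` option keeps that output request active across `run` boundaries, so the table from each run group goes into the same dataset. The `var YearSq;` statement makes `YearSq` available to the procedure even though the first model does not use it. After the first `run;`, the `add` statement adds `YearSq` to the model, and the `print` statement displays the results for the updated model. The procedure is terminated by `quit;`.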
PROC REG DATA=USPopulation;
   ods OUTPUT covb(persist=RUN)=Bmatrix;
   var YearSq;
   model Population = Year / covb;
RUN;

   add YearSq;
   PRINT;
QUIT;
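To make the effect of collecting both covariance tables in one dataset more concrete, here is a rough non-interactive alternative: each model is fitted in its own `PROC REG` call and the two `covb` tables are concatenated afterwards with a `DATA` step. This is a sketch of a different technique, not the sample's method; the names `Cov1`, `Cov2`, `BmatrixAlt`, `inFirst`, and `RunGroup` are hypothetical, and it assumes `USPopulation` already exists.

```sas
/* Sketch: one PROC REG call per model, each saving its own CovB table. */
proc reg data=USPopulation;
   ods output covb=Cov1;
   model Population = Year / covb;
run;
quit;

proc reg data=USPopulation;
   ods output covb=Cov2;
   model Population = Year YearSq / covb;
run;
quit;

/* Concatenate the two tables and tag each row with its source.   */
/* RunGroup is an illustrative stand-in for a run identifier.     */
data BmatrixAlt;
   set Cov1(in=inFirst) Cov2;
   RunGroup = ifn(inFirst, 1, 2);
run;
```

The interactive form in the sample reaches a similar result with a single procedure call, which is the point of `persist=run`.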
4 Code Block
PROC PRINT
Explanation :
Because no `DATA=` option is given, this `PROC PRINT` displays the most recently created dataset, which here is the `Bmatrix` dataset produced by the previous `PROC REG`. The `id _run_;` and `by _run_;` statements identify and group the observations by the `_run_` variable, which `ODS OUTPUT` adds to the dataset when `PERSIST=RUN` is in effect, so the covariance matrix from each run group is printed separately.
PROC PRINT;
   id _run_;
   BY _run_;
RUN;
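Relying on the most recently created dataset works here but breaks as soon as another step is inserted in between. A slightly more defensive variant, offered only as a sketch, names the dataset explicitly with the standard `DATA=` option:

```sas
/* Sketch: same listing, but with the input dataset named explicitly. */
proc print data=Bmatrix;
   id _run_;
   by _run_;
run;
```

The output should match the original step as long as `Bmatrix` is the last dataset created before it.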
This material is provided "as is" by We Are Cas. There are no warranties, expressed or implied, as to merchantability or fitness for a particular purpose regarding the materials or code contained herein. We Are Cas is not responsible for errors in this material as it now exists or will exist, nor does We Are Cas provide technical support for it.
Copyright Info: SAS SAMPLE LIBRARY, NAME: ODSEX6, TITLE: Documentation Example 6 for ODS, PRODUCT: STAT